Implementation of Perez et al. (2017) with an interactive Streamlit application covering three use cases: visual question answering on Sort-of-CLEVR and CLEVR, and artistic style transfer via Conditional Instance Normalisation.
A common challenge in deep learning is conditioning — adapting a network's behavior based on external information (a question, a style, a class label).
FiLM addresses this with a simple and general idea: instead of concatenating the conditioning context to the network's inputs, a separate network transforms it into scale and shift parameters γ and β that directly modulate the CNN feature maps:
- γ amplifies, reduces or suppresses a feature map
- β shifts activations up or down
- Both are produced by a lightweight network (the FiLM generator) from the conditioning input (e.g. the question)
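As a minimal illustration (a NumPy sketch, not the repo's PyTorch code), feature-wise modulation is just a per-channel affine transform of the feature maps:

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each channel."""
    # features: (C, H, W); gamma, beta: (C,) produced by the FiLM generator
    return gamma[:, None, None] * features + beta[:, None, None]

feats = np.ones((4, 8, 8))                   # toy feature maps, all ones
gamma = np.array([2.0, 1.0, 0.0, -1.0])      # amplify / keep / suppress / flip
beta  = np.array([0.5, 0.0, 0.0, 0.0])       # shift the first channel up

out = film(feats, gamma, beta)
# e.g. channel 0 becomes 2.5 everywhere, channel 2 is zeroed out
```

Because γ can be zero or negative, the generator can gate features off entirely or invert them, which is what gives FiLM its expressive power despite being a simple affine operation.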
In practice:

```bash
pip install -r requirements.txt
python -m streamlit run app.py
```

Everything runs from the interface; no further terminal interaction is needed.
Data and model weights are not included in the repo (too large). They are hosted on Google Drive and can be downloaded directly from the app.
| Dataset | Size | How to get it |
|---|---|---|
| Sort-of-CLEVR | ~200 MB | Button in the app |
| Style Transfer | ~400 MB | Button in the app |
| CLEVR VQA | ~18 GB | Manual (see below) |
A 2D Kaggle dataset of colored shapes with 11 answer classes. Each question is encoded as a 10-dimensional vector and passed to the FiLM generator, which modulates the CNN feature maps. You can train from scratch, load a pretrained model, and test visually on generated scenes.
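To make that pipeline concrete, here is a hypothetical NumPy sketch (the actual model lives in `sortofclevr/`; the generator shape and initialisation below are illustrative assumptions): a linear FiLM generator maps the 10-dimensional question encoding to one (γ, β) pair per feature channel:

```python
import numpy as np

rng = np.random.default_rng(0)
num_channels, q_dim = 4, 10                  # toy channel count; 10-dim question encoding

# Hypothetical linear FiLM generator: question vector -> (gamma, beta) per channel.
# Biases initialise gamma near 1 and beta near 0, i.e. close to an identity modulation.
W = rng.normal(scale=0.1, size=(2 * num_channels, q_dim))
b = np.concatenate([np.ones(num_channels), np.zeros(num_channels)])

q = np.zeros(q_dim)
q[3] = 1.0                                   # a one-hot-style question encoding
gamma, beta = np.split(W @ q + b, 2)         # generator output, split into gamma and beta

feats = rng.normal(size=(num_channels, 6, 6))  # CNN feature maps for the scene
modulated = gamma[:, None, None] * feats + beta[:, None, None]
```

Different questions produce different (γ, β) pairs, so the same CNN computes a question-specific function of the image.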
Full implementation of the paper's architecture. Since the dataset is ~18 GB, interactive training is not available; instead, the app displays the learning curves from our own run (~40k iterations).
To reproduce:

```bash
# Preprocess questions
python -m clevr.scripts.preprocess_questions \
    --input_questions_json CLEVR_v1.0/questions/CLEVR_train_questions.json \
    --output_h5_file clevr/data/train_questions.h5 \
    --output_vocab_json clevr/data/vocab.json

# Extract ResNet101 features
python -m clevr.scripts.extract_features \
    --data-dir clevr/data/clevr --split train

# Train
python -m clevr.scripts.train_model --model_type FiLM \
    --checkpoint_path clevr/data/film_checkpoint.pth \
    --batch_size 64 --num_iterations 100000 --loader_num_workers 0
```

Implementation of Ghiasi et al. (2017): the same FiLM conditioning idea applied to artistic style via Conditional Instance Normalisation. 6 styles are available with interactive inference from the app.
| Dataset | Validation Accuracy |
|---|---|
| Sort-of-CLEVR | ~94% (10 epochs) |
| CLEVR VQA (our run, 40k iters) | ~51% |
The gap with the paper's reported CLEVR VQA accuracy is mainly due to a reduced `hidden_dim` (256 vs. 4096) and a limited number of training iterations.
```
FiLMProjet/
├── app.py
├── pages/
│   ├── 0_Présentation.py
│   ├── 1_Sort_of_CLEVR.py
│   ├── 2_CLEVR_VQA.py
│   └── 3_Style_Transfer.py
├── sortofclevr/      # dataset, model, training
├── style_transfer/   # dataset, model, training
├── clevr/
│   ├── core/         # data, embedding, preprocess, utils
│   ├── models/       # film_net, film_gen, baselines, layers
│   ├── scripts/      # train, preprocess, extract features
│   └── data/         # vocab, h5 questions, result logs
├── assets/
└── requirements.txt
```
- Perez et al. (2017) — FiLM: Visual Reasoning with a General Conditioning Layer
- Johnson et al. (2017) — CLEVR: A Diagnostic Dataset for Visual Reasoning
- Ghiasi et al. (2017) — Exploring the structure of a real-time, arbitrary neural artistic stylization network
- Original FiLM codebase: github.com/ethanjperez/film
- Iliès Chenene
- Valentin Porlier