# CinemaNet

CinemaNet is a set of data and trained models for running inference to classify images or frames of video, with an eye toward photographic, cinematographic, compositional, and color labelling.

CinemaNet aims to give cinematographers, editors, archivists, and anyone interested in classification in a cinema / video context a useful, out-of-the-box way to classify images and frames of video.

## The Labels

The CinemaNet project aims to build a quasi knowledge graph of visual concepts useful to cinematographers, photographers, artists, designers, and illustrators, and as such has labels ranging from composition theory to shot locations. The first round of label concepts and categories is meant to provide an immediately helpful set of concepts and a baseline for the future. Note that our label naming scheme uses a reverse-DNS system, where the top-level name provides context for interpreting categories, concepts, and sub-concepts.
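As a rough illustration of the reverse-DNS scheme, a dotted label can be split into its category, concept, and any sub-concepts. The label strings below are hypothetical examples, not taken from the actual taxonomy — see Labels.md for the real list:

```python
def parse_label(label: str) -> dict:
    """Split a reverse-DNS style label into its levels.

    The label strings used here are illustrative only; the real
    CinemaNet taxonomy is defined in Labels.md.
    """
    parts = label.split(".")
    return {
        "category": parts[0],                           # top-level context
        "concept": parts[1] if len(parts) > 1 else None,
        "sub_concepts": parts[2:],                      # any deeper levels
    }

print(parse_label("color.saturation.desaturated"))
# {'category': 'color', 'concept': 'saturation', 'sub_concepts': ['desaturated']}
```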

See Labels.md for the most up to date information on the taxonomy and the list of categories and their concepts (it may lag slightly behind the active labels).

## The Data Set

Note: the raw data set images are only useful if you plan on training your own models, or are interested in helping optimize, classify, and iterate on the quality of the models. Generally speaking, you probably don't need them!

### 1: Check out the repository

Ensure that you have checked out this repository with git, or downloaded it via the green Clone or download button on the project page.

### 2: Install the dependencies for our data set download script

Ensure you have pip installed. Install Google Image Downloader, then install Google Chrome Driver in a version matching your currently installed Google Chrome browser (for me, it was 74.x). Chrome Driver is required to download more than 100 images per Google Images query.

#### 2.1: Install pip if necessary

Run `sudo easy_install pip` in your Terminal.app command line.

#### 2.2: Install Google Image Downloader

Run `pip install google_images_download` in your Terminal.app.

#### 2.3: Install Chrome Driver

Check that you have a version of Google Chrome installed in the default `/Applications/Google Chrome.app` location. Launch Chrome and check the version number by going to 'About Google Chrome' in the Chrome menu.

Download the matching version of Google Chrome Driver and place it in the same directory as these scripts.

### 3.0: Downloading the Data Set

You can then run `python synopsis_categories_and_concepts_image_downloader.py` to fetch the unfiltered raw data set, which will contain some noisy / misclassified images due to how Google Images returns results.
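Under the hood, the downloader boils down to one Google Images query per concept. A minimal sketch of how a per-concept argument dict for `google_images_download` might be assembled — the concept names, folder names, and `build_query` helper here are hypothetical, not taken from the actual script:

```python
# Sketch: build one google_images_download argument dict per concept.
# Concept names and output folders are made up for illustration;
# the real downloader script defines its own queries.
def build_query(category: str, concept: str,
                chromedriver: str = "./chromedriver") -> dict:
    return {
        "keywords": f"{category} {concept}",
        "limit": 500,                                 # >100 needs chromedriver
        "chromedriver": chromedriver,
        "output_directory": "dataset",
        "image_directory": f"{category}.{concept}",   # reverse-DNS folder name
    }

query = build_query("shot_location", "interior")
# To actually download (requires network access and chromedriver):
# from google_images_download import google_images_download
# google_images_download.googleimagesdownload().download(query)
```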

This download should be roughly 7.5 GB and contain roughly 63 thousand images, sorted into a folder structure by label category and concept. The data set then requires manual pruning to remove irrelevant or off-topic images from the folder structure.
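When pruning, it helps to audit how many images each concept folder still holds, so thin concepts stand out. A small sketch, assuming the one-folder-per-concept layout described above (the function name is ours, not part of the repository's utilities):

```python
import os

def count_images(root: str, exts=(".jpg", ".jpeg", ".png")) -> dict:
    """Count image files in each sub-folder under `root`.

    Assumes the downloader's layout of one folder per label concept;
    useful for spotting concepts left thin after manual pruning.
    """
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for f in filenames if f.lower().endswith(exts))
        if n:
            counts[os.path.relpath(dirpath, root)] = n
    return counts
```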

## Training your own models

You can follow along with the Running Training Notes to see the steps we took if you want to train your own models.

Sign up for Google's AutoML Vision cloud service if you want to train your own model. At the time of this writing you get approximately $300 in free credits.

See Running Training Notes for more info on training a model.

## Running the Auto Labeller / Model Cleaner and other utilities

If you want to use the other utilities (the auto labeler / HTML prediction preview script, the model metadata and label name clean-up script, and the label print-out script), you need to install Apple's coremltools Python package.

You can install via:

`pip install -U coremltools`
