Neural Network Training and Embedding Construction for Typeface Images

This repository holds the code to train the neural network and produce embeddings for fonts. The embeddings can be used to recognize fonts, and they are used in https://github.com/ericschulman/fonts_causal_analysis for causal economic analyses. See Han et al. (2020) for details of the neural network training and embedding construction.

File structure/files

This repository expects the data to live in folders outside the repository itself. Here is an example folder structure under a project directory named fonts_project.

fonts_project
├── datasets
│   ├── raw_pangrams
│   ├── crop7_test
│   ├── crop7_train
│   └── main_dataset
│       ├── Style Sku Family.csv
│       └── ...
├── models
├── logs
└── fontnet
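
The empty layout above can be created with a short script. This is a minimal sketch that assumes only the folder names shown in the tree; the helper name make_skeleton and the root path passed to it are hypothetical and not part of this repository.

```python
from pathlib import Path

# Folder names taken from the tree above. The fontnet folder is this
# repository itself and is cloned separately, so it is not created here.
DATA_DIRS = ["raw_pangrams", "crop7_test", "crop7_train", "main_dataset"]
TOP_DIRS = ["models", "logs"]


def make_skeleton(root):
    """Create the empty fonts_project layout under `root` (hypothetical helper)."""
    root = Path(root)
    created = []
    for name in DATA_DIRS:
        path = root / "datasets" / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    for name in TOP_DIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_skeleton("fonts_project") returns the list of created paths; the data files themselves still have to be copied in by hand.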

We ran the code in this repository using Anaconda with Python 3.7 on Ubuntu 18.04. TensorFlow 1.7 or later is required; install it with conda install tensorflow.
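
The version requirement can be checked before running any of the scripts. The sketch below is an illustration only: the helper names version_tuple and meets_minimum are hypothetical, and the comparison looks at the numeric parts of the version string and nothing else.

```python
def version_tuple(version):
    """Turn a string such as '1.15.0' into (1, 15, 0).

    Non-numeric suffixes (for example 'rc1') are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(version, minimum="1.7"):
    """Return True when `version` is at least `minimum` (1.7 per the text above)."""
    return version_tuple(version) >= version_tuple(minimum)
```

With TensorFlow installed, running import tensorflow as tf and then meets_minimum(tf.__version__) performs the check against the installed version.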

Preprocessing

Run preprocess.sh. This should create the necessary cropped data from the original pangram BMP images.

Training

Run train.sh. We trained until the loss was between 0.6 and 0.8; results may vary. Training took us about 36 hours on relatively weak hardware: an Intel Core i5-6260U CPU @ 1.80GHz × 4 with 16 GB of RAM.
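
The stopping rule described above can be written down as a small check. This is a minimal sketch, assuming the loss is available once per training step; the function name make_loss_monitor and the averaging window are illustrative additions and are not taken from train.sh.

```python
from collections import deque


def make_loss_monitor(low=0.6, high=0.8, window=100):
    """Return a callable that reports when the smoothed loss is in [low, high].

    The 0.6 to 0.8 range comes from the text above. Averaging over the last
    `window` steps is an assumption added here to damp step-to-step noise.
    """
    recent = deque(maxlen=window)

    def update(loss):
        recent.append(float(loss))
        mean_loss = sum(recent) / len(recent)
        # Decide only once the window is full, so early noisy steps are ignored.
        return len(recent) == window and low <= mean_loss <= high

    return update
```

A training loop would call the returned function with each new loss value and stop once it returns True.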

Cross-validation

First run gen_pairs.sh. This should create the data needed for cross-validation. The pairs.txt files will appear in the folder with the test data. There are two sets:

Easy: generated by specifying --diff_style 0.
Hard: generated by specifying --diff_style 1. This set tests whether fontnet has learned to recognize font families and not just styles.
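
To make the difference between the two sets concrete, here is a hedged sketch of pair sampling. It rests on an assumed reading of --diff_style that the text does not confirm: with 0, matched pairs share both family and style; with 1, matched pairs share a family but differ in style. The function generate_pairs, the (image_name, family, style) tuples, and the output tuples are all hypothetical and do not describe the actual pairs.txt format.

```python
import itertools
import random


def generate_pairs(samples, diff_style, n_pairs, seed=0):
    """Sample matched and mismatched image pairs.

    samples: list of (image_name, family, style) tuples (hypothetical layout).
    diff_style: 0 -> matched pairs share family and style (assumed easy set);
                1 -> matched pairs share family, differ in style (assumed hard set).
    Mismatched pairs always come from different families.
    Returns a list of (image_a, image_b, is_same_family) tuples.
    """
    rng = random.Random(seed)
    matched, mismatched = [], []
    # Enumerating all combinations is quadratic; fine for a small illustration.
    for a, b in itertools.combinations(samples, 2):
        same_family = a[1] == b[1]
        same_style = a[2] == b[2]
        if not same_family:
            mismatched.append((a[0], b[0], False))
        elif same_style != bool(diff_style):
            matched.append((a[0], b[0], True))
    rng.shuffle(matched)
    rng.shuffle(mismatched)
    return matched[:n_pairs] + mismatched[:n_pairs]
```

Under this reading, a model that only memorizes styles would do well on the easy set and poorly on the hard set, because the hard matched pairs never share a style.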

Then run validate.sh. This should display statistics about the trained model. You will need to specify the model and log directories. The relevant folders are generated by training a model.
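
The text does not say which statistics validate.sh reports. One common statistic for pair-based validation is accuracy at a distance threshold, sketched below under that assumption. The names euclidean and pair_accuracy, the dictionary of embeddings, and the pair tuples are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def pair_accuracy(embeddings, pairs, threshold):
    """Fraction of pairs classified correctly by thresholding embedding distance.

    embeddings: dict mapping image name -> list of floats.
    pairs: iterable of (image_a, image_b, is_same) tuples.
    A pair is predicted "same" when its distance is below `threshold`.
    """
    pairs = list(pairs)
    correct = 0
    for a, b, is_same in pairs:
        predicted_same = euclidean(embeddings[a], embeddings[b]) < threshold
        correct += predicted_same == is_same
    return correct / len(pairs)
```

In practice the threshold would be chosen on held-out pairs rather than fixed in advance.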

Saving the embeddings

Run write_embeddings.sh. You will need to specify the model and log directories, both of which are generated by training a model. The result of this script will appear in the main_dataset folder. Unless you modify the code, the file will be called embeddings_full.csv.
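
Once the file exists, it can be loaded and used for a simple nearest-neighbor lookup. The sketch below assumes a column layout that the repository does not confirm: one header row, a font identifier in the first column, and one embedding dimension in each remaining column. The functions load_embeddings and nearest_font are hypothetical; adjust the parsing to the actual file.

```python
import csv
import math


def load_embeddings(path):
    """Read a CSV into {identifier: [floats]}.

    Assumed layout (not confirmed by the repository): one header row, the
    first column is a font identifier, every other column is one embedding
    dimension.
    """
    table = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if row:
                table[row[0]] = [float(value) for value in row[1:]]
    return table


def nearest_font(table, query):
    """Return the identifier whose embedding is closest (Euclidean) to `query`."""
    def distance(vector):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(vector, query)))

    return min(table, key=lambda name: distance(table[name]))
```

For example, nearest_font(load_embeddings("embeddings_full.csv"), query_vector) returns the identifier of the stored font closest to a query embedding of the same length.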

References

License

The code and the dataset (shared separately) for this repository are protected by the Creative Commons non-commercial, no-derivatives license.
