Skip to content

hdr-bgnn/iqm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

66 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Source code for "Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species" (Manuscript) presented at MTSR2020.

Organization:
├── HGNN
│   ├── analyze
│   │   ├── analyze_experiments.ipynb                              Confusion matrices
│   │   ├── analyze_trial.ipynb                                    
│   │   ├── calculate_random_accuracy.ipynb                        
│   │   ├── correlation_of_features.ipynb                          
│   │   ├── flashtorch_modified.py                                 
│   │   ├── obtain_misclassified_examples.ipynb                    Extract misclassified and look at quality factors
│   │   └── plot_network.ipynb                                     
│   ├── dataPrep
│   │   ├── Segment_images.ipynb                                   
│   │   ├── copy_image_subset.ipynb                                Useful for subsets (called "suffix" for some reason)
│   │   ├── create_cleaned_metadata.ipynb                          metadata -> cleaned_metadata
│   │   ├── generate_augmented_data.ipynb                          rotate and such
│   └── train
│       ├── CNN.py                                                 Where the rubber meets the road
│       ├── CSV_processor.py                                       Parse a experiment-wide result and metadata file
│       ├── ConfigParserWriter-QualityExperiments.ipynb            Generate config for the quality experiments
│       ├── configparser_all_graded_images.ipynb                   Generate config for the whole corpus experiment
│       ├── cifar_nin.py                                           
│       ├── configParser.py                                        Parse an individual experiment config
│       ├── dataLoader.py                                          Train/test manager
│       ├── resnet_cifar.py                                        
│       └── train.py                                               Run everything
├── metadata
│   ├── HDR-IMG_2020.01.27.txt                                     All INHS teleost taxa
│   ├── hdrwebform_20200728.csv                                    Manually curated quality metadata
│   ├── high_qual_metadata.tsv                                     Taxonomically balanced high quality subset
│   ├── low_qual_metadata.tsv                                      Taxonomically balanced low quality subset
│   └── orig_cleaned_metadata.tsv                                  Fish with at least 10 individuals

Install dependencies: Using pip

virtualenv .
pip install -r pip_requirements.txt

or using Conda

conda create -n hgnn
conda activate hgnn
conda install conda_requirements.yaml

Install myhelpers:

pip install git+ssh://git@github.com/hdr-bgnn/myhelpers.git

Setup paths and parameters:

Alter the config.default.ini to create a config.ini in the HGNN directory

[general]
experimentsPath = /home/elhamod/HGNN/experiments/
dataPath = /home/elhamod/projects/HGNN/data
experimentName = BestModel
fine_csv_fileName = metadata.csv

[dataPrep]
image_path = INHS_cropped
output_directory_name = 52
cleaned_species_csv_fileName = cleaned_metadata.csv
species_csv_fileName = metadata.csv
numberOfImagesPerSpecies = 50

[augmented]
applyHorizontalFlip = True
applyHorizontalFlipProbability = 0.1
applyTranslation = True
applyTranslationAmount = 0.1
applyColorPCA = True
colorPCAperturbationMagnitude = 0.1
numOfAugmentedVersions = 10
cuda = 0
    

[train]
modelFinalCheckpoint = finalModel.pt
statsFileName = stats.csv
timeFileName = time.csv
epochsFileName = epochs.csv
paramsFileName = params.json


# cleaned up metadata file that has no duplicates, invalids, etc
cleaned_fine_csv_fileName = cleaned_metadata.csv

# Saved file names.
statistic_countPerFine = count_per_fine.csv
statistic_countPerFamilyAndGenis = count_per_family_genus.csv

experimentsFileName = experiments.csv

[metadatatableheaders]
fine_csv_fileName_header = fileName
fine_csv_scientificName_header = scientificName
fine_csv_Coarse_header = Genus
fine_csv_Family_header = Family

[dataloader]
testIndexFileName = testIndex.csv
valIndexFileName = valIndex.csv
trainingIndexFileName = trainingIndex.csv
paramsFileName = params.json
  • Have your data with metadata.csv in a directory

  • Create a config file using HGNN/train/ConfigParserWriter-*

  • train a model using train.py. e.g.: python3 train.py --cuda=7 --name="learningRateTest" --experiments="/home/elhamod/HGNN/experiments/" --data="/data/BGNN_data"

  • analyze the data using jupyter notebooks under HGNN/analyze/ .e.g Analyze trial, Analyze experiments.

  • Once trained and analyzed, an experiment folder with name / will have been created, with a "models" folder that has all the trained model, a "results" folder with the analysis, and a "datasplits" folder with the indexes of images used for train/val/test.

About

source code for "Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages