Graph Machines

An implementation of Graph Machines in Python, applied to datasets of molecules.

Project Organization

├── LICENSE
├── README.md          	<- The top-level README for developers using this project.
├── data
│   ├── external       	<- Data from third party sources.
│   ├── interim        	<- Intermediate data that has been transformed.
│   ├── processed      	<- The final, canonical data sets for modeling.
│   └── raw            	<- The original, immutable data dump.
│
├── docs               	<- A default Sphinx project; see sphinx-doc.org for details
│
├── models             	<- Trained and serialized models, model predictions, or model
│			           summaries
│
├── notebooks          	<- Jupyter notebooks. Naming c
│
├── references         	<- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            	<- Generated analysis as HTML, PDF, LaTeX, generated graphics and 
│                              figures to be used in reporting
│
├── Pipfile, Pipfile.lock   <- Files used by pipenv
│ 
│
├── src                	<- Source code for use in this project.
│   ├── __init__.py    	<- Makes src a Python module
│   │
│   ├── data           	<- Scripts to download or generate data
│   │   ├── load_dataset.py
│   │   └── make_dataset.py
│   │
│   ├── scott       	<- Scripts to turn raw data into newick format
│   │  
│   │
│   ├── models         	<- Scripts to train regression models and then use trained models to
│   │   ├── predict_model.py   make predictions
│   │   └── train_model.py
│   │
│   ├── visualization  	<- Scripts to create exploratory and results oriented visualizations
│   │      └── visualize.py
│   │
│   └── Net  	        <- Neural Network 
│       └── FNN_GM_Net.py
│
├──GM-Classification.py     <- Script for classification task
│
└──GM-Regression.py         <- Script for regression task

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Dataset

The datasets used can be downloaded from: https://brunl01.users.greyc.fr/CHEMISTRY/

Prerequisites

First, clone this repository to your machine:

git clone https://github.com/elbarto91/GraphMachines-.git

The downloaded repository is ready to use as-is (if you obtained it as an archive, you only need to extract it). To run the scripts (training or prediction) in a separate environment, Python 3.7 (the project was built with 3.7.5) and pipenv must be installed.

Installing

Get python3:

sudo apt-get update
sudo apt-get install python3.7

Get pipenv:

sudo pip3 install pipenv

To install the environment, run the following from the project's root folder:

sudo pipenv install

To display the plots, you may need to install the "tkinter" package:

sudo apt-get install python3-tk

Usage

usage: GM-Regression.py [-h] [-d DEVICE] [-e NUM_EPOCHS]
                        [-hln HIDDEN_LAYER_SIZE] [-lr LEARNING_RATE]
                        [-r REPORT] [-rdd ROOTDIRDATASET] [-trf TRAINFILE]
                        [-tef TESTFILE] [-s SAVE] [-l LOAD] [-b BIAS]
                        [-rn REPORTNAME] [-mp MODELPATH]

optional arguments:
  -h, --help            show this help message and exit
  -d DEVICE, --device DEVICE
                        device to use(GPU or CPU(defualt))
  -e NUM_EPOCHS, --num_epochs NUM_EPOCHS
                        number of epochs,default=10000
  -hln HIDDEN_LAYER_SIZE, --hidden_layer_size HIDDEN_LAYER_SIZE
                        number of nodes for the hidden layer, default = 4
  -lr LEARNING_RATE, --learning_rate LEARNING_RATE
                        learning rate for the optimizer, default = 0.001
  -r REPORT, --report REPORT
                        save result in a report file
  -rdd ROOTDIRDATASET, --rootDirDataset ROOTDIRDATASET
                        directory of dataset files
  -trf TRAINFILE, --trainFile TRAINFILE
                        dataset containing the name on the trainset files
  -tef TESTFILE, --testFile TESTFILE
                        dataset containing the name on the testset files
  -s SAVE, --save SAVE  True if you want to save the model, default = False
  -l LOAD, --load LOAD  True if you want to load the model, default = False
  -b BIAS, --bias BIAS  bias value, default = 1
  -rn REPORTNAME, --reportName REPORTNAME
                        base name for the report's folder
  -mp MODELPATH, --modelPath MODELPATH
                        model's path
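
The help text above could be produced by an argparse parser along the following lines. This is only a sketch reconstructed from the usage message, not the parser in GM-Regression.py: the `str2bool` helper, the argument types, and the defaults for `--device` and `--report` are assumptions, while the flag names and the documented defaults (epochs, hidden layer size, learning rate, save, load, bias) come from the help text.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: the flags take a value such as 'True' or 'False'."""
    return str(value).strip().lower() in {"true", "1", "yes"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options listed in the usage message."""
    parser = argparse.ArgumentParser(prog="GM-Regression.py")
    parser.add_argument("-d", "--device", default="CPU",
                        help="device to use (GPU or CPU)")  # default assumed from help text
    parser.add_argument("-e", "--num_epochs", type=int, default=10000)
    parser.add_argument("-hln", "--hidden_layer_size", type=int, default=4)
    parser.add_argument("-lr", "--learning_rate", type=float, default=0.001)
    parser.add_argument("-r", "--report", type=str2bool, default=False)  # default assumed
    parser.add_argument("-rdd", "--rootDirDataset")
    parser.add_argument("-trf", "--trainFile")
    parser.add_argument("-tef", "--testFile")
    parser.add_argument("-s", "--save", type=str2bool, default=False)
    parser.add_argument("-l", "--load", type=str2bool, default=False)
    parser.add_argument("-b", "--bias", type=float, default=1.0)  # type assumed
    parser.add_argument("-rn", "--reportName")
    parser.add_argument("-mp", "--modelPath")
    return parser


if __name__ == "__main__":
    # Parse the training command shown in this README.
    args = build_parser().parse_args([
        "-e", "1000", "-rdd", "data/processed/Acyclic/",
        "-trf", "trainset_0.ds", "-tef", "testset_0.ds",
        "--reportName", "ACYCLIC", "--save", "True",
    ])
    print(args)
```

Because the boolean flags expect an explicit value (`--save True`), a plain `type=bool` would treat any non-empty string as true; converting the string explicitly avoids that pitfall.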

To use the environment created with pipenv, activate it from the project's root folder:

pipenv shell

TRAIN NEURAL NETWORK

python GM-Regression.py -e 1000 -rdd data/processed/Acyclic/ -trf trainset_0.ds -tef testset_0.ds --reportName ACYCLIC --save True

PREDICT USING A TRAINED NEURAL NETWORK

(Check that you have a saved model.)

python GM-Regression.py -rdd data/processed/Acyclic/  --report True -tef testset_0.ds --reportName ACYCLIC --load True --modelPath models/ACYCLIC/model_testset_0.ds-Dvalue12-maxMValue4-Saved.pth
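
The check for a saved model can be automated before launching a prediction run. The sketch below is not part of the repository: `build_predict_command` is a hypothetical helper that verifies the model file exists and then assembles the same arguments as the command above.

```python
import sys
from pathlib import Path
from typing import List


def build_predict_command(model_path: str, root_dir: str,
                          test_file: str, report_name: str) -> List[str]:
    """Return the argument list for a prediction run.

    Raises FileNotFoundError if no saved model exists at model_path.
    """
    model = Path(model_path)
    if not model.is_file():
        raise FileNotFoundError(
            f"No saved model at {model}; train with --save True first."
        )
    return [
        sys.executable, "GM-Regression.py",
        "-rdd", root_dir,
        "--report", "True",
        "-tef", test_file,
        "--reportName", report_name,
        "--load", "True",
        "--modelPath", str(model),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the project's root folder, inside the pipenv shell.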

Built With

PyCharm - Integrated development environment (IDE)
Git - Distributed version-control system for tracking changes in source code during software development
Python 3.7.5 - Interpreted, high-level, general-purpose programming language
PyTorch - An open source machine learning framework
Jupyter Notebook - Open-source web application
Pipenv - Packaging tool for Python

Based on:

Scott - software able to compute, for any fully-labelled (edge and node) graph, a canonical tree representative of its isomorphism class, that can be derived into a canonical trace (string) or adjacency matrix
Graph Machines and Their Applications to Computer-Aided Drug Design: A New Approach to Learning from Structured Data - Graph machines learn real numbers from graphs.

Thanks to

Authors

Giovanni Colucci - Total work - Elbarto91
Luc Brun - Supervisor - Luc Brun

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

If you want to use a different dataset, make sure it follows the same layout as the datasets in data/processed.
Use watch -n 0.5 nvidia-smi to check the GPU status.
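
The same GPU check can be done from Python, for example before choosing the `--device` option. This is a minimal sketch, assuming the `nvidia-smi` tool is on the PATH when an NVIDIA GPU is present; `gpu_status` is a hypothetical helper and not part of this project.

```python
import shutil
import subprocess
from typing import Optional


def gpu_status() -> Optional[str]:
    """Return one CSV line per GPU (name, utilization, memory), or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # nvidia-smi is not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None  # tool present but no usable driver or GPU
    return result.stdout.strip()


if __name__ == "__main__":
    print(gpu_status() or "nvidia-smi not available")
```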
