Mahalangur: Summiting the Himalayas

Mahalangur is a small Python data science project that demonstrates how to design Scikit Learn models with a replicable project structure and how to create a simple web API and visualization in Flask. The project structure is largely based on the Cookiecutter Data Science template and is outlined in the Project Organization section below.

Visualization Demo

A demo of the web interface for Mahalangur is hosted on PythonAnywhere:

Acknowledgement

This project is named after the Mahalangur Himal, a section of the Himalayas that contains four of the six tallest mountains - including Mount Everest. The expedition data is sourced from the Himalayan Database.

License

This project is ISC licensed. However, the climb data is sourced from the Himalayan Database - please reach out to them if you wish to use their data for anything other than personal use.

Project Organization

The repository is a Python project using the following folder structure:

Mahalangur
├── LICENSE
├── README.md          <- The top-level README for developers using this project
├── Makefile           <- Makefile with commands like `make install_requirements`
│
├── mahalangur
│   ├── assets         <- Serialized models that are to be distributed with the
│   │                     package
│   │
│   ├── data           <- Code for downloading or generating raw data
│   │   ├── metadata   <- Static metadata such as code tables
│   │   └── sql        <- Database definitions for SQLite datastore
│   │
│   ├── feat           <- Code to turn raw data into features for modelling
│   ├── web            <- Flask API and web visualization code
│   └── rfmodel.py     <- Code for training the model
│
├── notebooks          <- Jupyter notebooks used for exploring/analysing the data and
│                         for prototyping models
│
└── references         <- Data dictionaries, manuals and other explanatory materials

Usage

Installation

Clone Mahalangur to a folder of your choice:

git clone https://github.com/trthatcher/Mahalangur.git
cd Mahalangur

Next, create the mahalangur conda environment and install the requirements:

make environment
conda activate mahalangur
make install_requirements

This will install the package and its dependencies.

Getting the Latest Data

By default, this package will download training data to a .mahalangur directory in your home directory. You can override this by setting a MAHALANGUR_HOME environment variable to the directory of your choosing. The Mahalangur data directory is laid out in the following way:

.mahalangur
│
├── data
│   ├── raw            <- Raw data is downloaded to this directory
│   ├── processed      <- Processed data is stored in this directory
│   └── mahalangur.db  <- This database is created to store the processed data
│
├── metadata           <- Processed metadata is output here
│
└── models             <- Serialized models are output here

To download the latest version of the Himalayan Database, run the following command in the terminal:

make dataset

This will populate the .mahalangur/data directory with updated extracts and transfer them into the mahalangur.db SQLite database. An updated model can be created by running:

make model_rf

The model will be stored in the .mahalangur/models directory. Note that if you would like to update the model used by the package, you will need to transfer it to the assets directory in the package.

Starting the API

Once the package has been cloned and installed, you can run the web visualization locally with the following command:

make api

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
mahalangur		mahalangur
notebooks		notebooks
references		references
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE.md		LICENSE.md
Makefile		Makefile
README.md		README.md
requirements-optional.txt		requirements-optional.txt
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mahalangur

mahalangur

notebooks

notebooks

references

references

.gitattributes

.gitattributes

.gitignore

.gitignore

LICENSE.md

LICENSE.md

Makefile

Makefile

README.md

README.md

requirements-optional.txt

requirements-optional.txt

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

Mahalangur: Summiting the Himalayas

Visualization Demo

Acknowledgement

License

Project Organization

Usage

Installation

Getting the Latest Data

Starting the API

About

Releases

Packages

Languages

License

trthatcher/Mahalangur

Folders and files

Latest commit

History

Repository files navigation

Mahalangur: Summiting the Himalayas

Visualization Demo

Acknowledgement

License

Project Organization

Usage

Installation

Getting the Latest Data

Starting the API

About

Topics

Resources

License

Stars

Watchers

Forks

Languages