A well-organized project, with everything placed in one directory, saves time otherwise spent searching for project files such as datasets, code, and output files like machine learning models. It also helps you keep a record of your ongoing and completed data science projects.

## Create a file path JSON file

The pprint module, short for "pretty-print", is part of Python's standard library. It prints data structures such as nested dictionaries in a more readable format than the default print function. By default it also sorts dictionary keys, so the order of the output below differs from the order in which the dictionary is written.

In [4]:
from pprint import pprint

FPATHS = dict(
    data={
        "raw": {
            "full": "data/ames-housing-dojo-for-ml.csv",  
            "eda": "data/ames-housing-dojo-for-ml-eda.csv" 
        },
        "ml": {
            "train": "data/training-data.joblib",  
            "test": "data/testing-data.joblib", 
        },
        "nlp": {
            "review_data": "Data/nlp/movie_reviews.csv",  # raw movie review data
        },
    },
    models={
        "linear_regression": "models/linear_regression/linreg.joblib", 
        "random_forest": "models/random_forest/rf_reg.joblib", 
    },
    images={
        "banner": "images/app-banner.png", 
    },
)
pprint(FPATHS)



{'data': {'ml': {'test': 'data/testing-data.joblib',
                 'train': 'data/training-data.joblib'},
          'nlp': {'review_data': 'Data/nlp/movie_reviews.csv'},
          'raw': {'eda': 'data/ames-housing-dojo-for-ml-eda.csv',
                  'full': 'data/ames-housing-dojo-for-ml.csv'}},
 'images': {'banner': 'images/app-banner.png'},
 'models': {'linear_regression': 'models/linear_regression/linreg.joblib',
            'random_forest': 'models/random_forest/rf_reg.joblib'}}


In [5]:
# Save the file path dictionary in the config folder so it can be reused at deployment

import json
import os

os.makedirs('config/', exist_ok=True)
FPATHS_FILE = 'config/filepaths.json'
with open(FPATHS_FILE, 'w') as f:
    json.dump(FPATHS, f)
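
By default, `json.dump` writes the whole dictionary on a single line. If the saved file is meant to be read or edited by hand, passing `indent` makes it easier to scan. The sketch below is only an illustration: `example_paths` is a hypothetical stand-in for `FPATHS`, and it writes to a temporary folder rather than to `config/`.

```python
import json
import os
import tempfile

# Hypothetical stand-in for FPATHS; the real dictionary has more entries.
example_paths = {"models": {"random_forest": "models/random_forest/rf_reg.joblib"}}

# Write to a temporary folder so this sketch does not touch the project's config/ folder.
config_dir = tempfile.mkdtemp()
example_file = os.path.join(config_dir, "filepaths.json")

with open(example_file, "w") as f:
    json.dump(example_paths, f, indent=4)  # indent=4 puts each key on its own line

# Read the file back to confirm the round trip preserves the dictionary.
with open(example_file, "r") as f:
    loaded_paths = json.load(f)

print(loaded_paths == example_paths)
```

The indentation only changes how the file looks on disk; `json.load` returns the same dictionary either way.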




In [6]:
# Use a custom function to create the directories for these paths

# First load the autoreload extension so edits to custom_function are picked up
%load_ext autoreload
%autoreload 2
import custom_function as fn

In [8]:
fn.create_directories_from_paths(FPATHS)

Directory created: Data/nlp
Directory created: models/linear_regression
Directory created: models/random_forest
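
The `custom_function` module is not shown here, so the following is only a guess at what `create_directories_from_paths` might look like. It assumes the function walks the nested dictionary, takes the parent folder of each file path, and prints a message only for folders that did not already exist, which would match the output above.

```python
import os


def create_directories_from_paths(nested_dict):
    """Create the parent folder of every file path found in a nested dictionary.

    Hypothetical sketch; the real helper in custom_function may differ.
    """
    for value in nested_dict.values():
        if isinstance(value, dict):
            # Nested section (e.g. "data" -> "ml"): recurse into it.
            create_directories_from_paths(value)
        elif isinstance(value, str):
            directory = os.path.dirname(value)
            # Skip bare filenames (no folder) and folders that already exist.
            if directory and not os.path.isdir(directory):
                os.makedirs(directory, exist_ok=True)
                print(f"Directory created: {directory}")
```

Because `os.makedirs` is called with `exist_ok=True`, running the function a second time is harmless and prints nothing.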


In [9]:
# Let's check the saved file paths

with open('config/filepaths.json', 'r') as f:
    path = json.load(f)

In [10]:
path.keys()

dict_keys(['data', 'models', 'images'])

In [12]:
path['data']['nlp']

{'review_data': 'Data/nlp/movie_reviews.csv'}
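
Once the paths are loaded, it can be handy to see at a glance which configured files actually exist. The helper below, `flatten_paths`, is a hypothetical addition (not part of the original notebook) that turns the nested dictionary into dotted names; `example` is a small stand-in for the loaded `path` dictionary.

```python
import os


def flatten_paths(nested, prefix=""):
    """Flatten a nested dict of paths into {"section.name": path} pairs."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_paths(value, name))
        else:
            flat[name] = value
    return flat


# Small stand-in for the dictionary loaded from config/filepaths.json.
example = {
    "data": {"nlp": {"review_data": "Data/nlp/movie_reviews.csv"}},
    "images": {"banner": "images/app-banner.png"},
}

for name, file_path in flatten_paths(example).items():
    # os.path.exists reports whether the file is present on disk right now.
    print(name, file_path, os.path.exists(file_path))
```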