Personal Utils (perutils)

Notebook -> module conversion with #export flags and nothing else

Purpose: The purpose and main use of this module is for adhoc projects where a full blown nbdev project is not necessary

Example Scenario

Imagine you are working on a kaggle competition. You may not want the full nbdev. For example, you don't need separate documentation from your notebooks and you're never going to release it to pip or conda. This module simplifies the process so you just run one command and it creates .py files from your notebooks. Maybe you are doing an ensemble and to export the dataloaders from a notebook so you can import them into seperate notebooks for your seperate models, or maybe you have a seperate use case.

That's what this module does. it's just the #export flags from nbdev and exporting to a module folder with no setup (ie settings.ini, __nbdev.py, etc.) for fast minimal use

Install

pip install perutils

How to use

Shelve Experiment Tracking

This module is designed to assist me in tracking experiments when I am working on data science and machine learning tasks, though is flexible enough to track most things. This allows for easy tracking and plotting of many different types of information and datatypes without requiring a consistent schema so you can add new things without adjusting your dataframe or table.

General access to a shelve db can be reached in one of two ways and behaves similar to a dictionary.

with shelve.open('test.shelve') as d: 
    print(d['exp'])

d = shelve.open('test.shelve')
print(d['exp']
d.close()

This module assumes a certain structure. If we assume: d = shelve.open('test.shelve')

assert type(d[key]) == list
assert type(d[key][0]) == dict

Additionally:

keys in an experiment (d['exp'][0][key] must be strings but the values can be anything that can be pickled
Plotting functions assumes the value you want to plot (ie d['exp][0]['batch_loss'] is list like and the name (for the legend) is a string

Create and Add Data

The process is:

Create a dict with all the information
Append dict to database

This will create filename if it does not exist

append(filename,new_dict)

{% include note.html content='You can write individual elements at a time as well just like you would in a normal dictionary if that is preferred.' %}

Delete

-1 can be replaced with any index location.

delete(filename,-1)

What keys are available?

print_keys(filename)

What were the results?

el,ea,bl = get_stats(filename,-1,['epoch_loss','epoch_accuracy','batch_loss'],display=True)

Find the experiment with the best results.

print_best(filename,'epoch_loss',best='min')
print_best(filename,'epoch_accuracy',best='max')

Graph some stats and compare results

graph_stats(filename,['batch_loss','epoch_accuracy'],idxs=[-1,-2,-3])

nb -> py

Full Directory Conversion

In python run the simple_export_all_nb function. This will:

Look through all your notebooks in the directory (nbs_path) for any code cells starting with #export or # export
If any export code cells exist, it will take all the code and put it in a .py file located in lib_path
The .py module will be named the same as the notebook. There is no option to specify a seperate .py file name from your notebook name

Any .py files in your lib_path will be removed and replaced. Do not set lib_path to a folder where you are storing other .py files. I recommend lib_path being it's own folder only for these auto-generated modules

simple_export_all_nb(nbs_path=Path('.'), lib_path=Path('test_example'))```

#### Single Notebook Conversion

In python run the `simple_export_one_nb` function.  This will:

+ Look through the specified notebook (nb_path) for any code cells starting with `#export` or `# export`
+ If any export code cells exist, it will take all the code and put it in a .py file located in `lib_path`
+ The .py module will be named the same as the notebook.  There is no option to specify a seperate .py file name from your notebook name


```python
simple_export_one_nb(nb_path=Path('./00_core.ipynb'), lib_path=Path('test_example'))```

### py -> nb

#### Full Directory Conversion

In python run the `py_to_nb` function.  This will:
+ Look through all your py files in the `py_path`
+ Find the simple breaking points in each file (ie when new functions or classes are defined
+ Create jupyter notebooks in `nb_path` and put code in seperate cells (with `#export` flag)

**This will overwrite notebooks in the `nb_path` if they have the same name other than extension as a python module**

```python
py_to_nb(py_path=Path('./src/'),nb_pth=Path('.')```

### kaggle dataset

#### Uploading Libraries

```python
if __name__ == '__main__':
    libraries = ['huggingface','timm','torch','torchvision','opencv-python','albumentations','fastcore']

    for library in libraries: 
        print(f'starting {library}')
        dataset_path = Path(library)
        print("downloading dataset...")
        download_dataset(dataset_path,f'isaacflath/library{library}',f'library{library}',content=False,unzip=True)
        print("adding library...")
        add_library_to_dataset(library,dataset_path)
        print("updating dataset...")
        update_datset(dataset_path,"UpdateLibrary")

        print('+'*30)

Custom dataset (ie model weights)

dataset_path = Path(library)
dataset_name = testdataset
download_dataset(dataset_path,f'isaacflath/{dataset_name}',f'{dataset_name}',content=False,unzip=True)
# add files (ie model weights to folder
update_datset(dataset_path,"UpdateLibrary")

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
.github/workflows		.github/workflows
conda/perutils		conda/perutils
docs		docs
perutils		perutils
test_example		test_example
.devcontainer.json		.devcontainer.json
.gitignore		.gitignore
01_nbutils.ipynb		01_nbutils.ipynb
02_tabutils.ipynb		02_tabutils.ipynb
03_Tracking.ipynb		03_Tracking.ipynb
04_kaggle.ipynb		04_kaggle.ipynb
05_splitting.ipynb		05_splitting.ipynb
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
index.ipynb		index.ipynb
settings.ini		settings.ini
setup.py		setup.py
test.txt		test.txt

License

Isaac-Flath/perutils

Folders and files

Latest commit

History

Repository files navigation