__pycache__
.gitignore
README.md
file_info.py
gen_data.py
load_time_data.py
progress.py
run.py
run_data.csv
test.json
test.pickle
test.py


Data Storage Experiment for Python Native Frameworks

A generative exploration of loading time for datasets of varying complexity across single-file storage types.

Background

JSON's similarity to Python's dictionary data model got me wondering why we need JSON at all in Python-native frameworks such as Django or Flask. A dictionary object can simply be imported. In fact, "import data" seems particularly Pythonic to me. This experiment first generates JSON, .py, and pickled dictionary files of various lengths and complexity. Second, the program loads the data into two data types used in data science and application development: a Pandas DataFrame and a Python dictionary.
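To make the comparison concrete, here is a minimal sketch of writing one dictionary to the three file types and timing how long each takes to load back into a dictionary. It is not the code in run.py or load_time_data.py; the function names (write_all, load_json, load_py, load_pickle, time_loads) and file names are assumptions made for illustration, and it uses only the standard library, so the Pandas step is left out.

```python
import importlib.util
import json
import pickle
import tempfile
import time
from pathlib import Path


def write_all(data, folder):
    """Write the same dict as a JSON file, an importable .py file, and a pickle."""
    folder = Path(folder)
    paths = {
        "json": folder / "sample.json",
        "py": folder / "sample_data.py",
        "pickle": folder / "sample.pickle",
    }
    paths["json"].write_text(json.dumps(data))
    # repr() of a dict of plain strings, numbers, lists, and dicts is valid Python source.
    paths["py"].write_text("data = " + repr(data) + "\n")
    paths["pickle"].write_bytes(pickle.dumps(data))
    return paths


def load_json(path):
    with open(path) as f:
        return json.load(f)


def load_py(path):
    # Equivalent in spirit to "import sample_data", but loads from an explicit path.
    spec = importlib.util.spec_from_file_location("sample_data", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.data


def load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def time_loads(paths):
    """Return {file type: (seconds elapsed, loaded dict)} for each storage type."""
    loaders = {"json": load_json, "py": load_py, "pickle": load_pickle}
    results = {}
    for kind, loader in loaders.items():
        start = time.perf_counter()
        loaded = loader(paths[kind])
        results[kind] = (time.perf_counter() - start, loaded)
    return results


if __name__ == "__main__":
    sample = {"name": "a", "tags": ["x", "y"], "child": {"n": 1}}
    with tempfile.TemporaryDirectory() as tmp:
        for kind, (elapsed, _) in time_loads(write_all(sample, tmp)).items():
            print(f"{kind}: {elapsed:.6f} s")
```

A single timed load like this is noisy; repeating each load several times and averaging would give steadier numbers.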

Usage

python run.py

Parameters

The nest depth parameters set the minimum, maximum, and step size for how deeply nested each generated object is.

The records per nest level parameters set the minimum, maximum, and step size for how many records each level contains, that is, how wide each level is.
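As a rough illustration of how these two parameter sets might drive data generation, the sketch below builds a dictionary of a given depth and width and walks every (depth, width) combination. The names build_nested, nest_depth, and parameter_grid are hypothetical, the "child" key is an invented convention, and treating the maximum as inclusive is an assumption; the actual logic in gen_data.py may differ.

```python
def build_nested(depth, width):
    """Return a dict nested `depth` levels deep with `width` records per level."""
    if depth < 1 or width < 1:
        raise ValueError("depth and width must be at least 1")
    level = {f"field_{i}": i for i in range(width)}
    if depth > 1:
        # Each level holds its records plus one nested child level.
        level["child"] = build_nested(depth - 1, width)
    return level


def nest_depth(obj):
    """Count how many levels a dict built by build_nested contains."""
    return 1 + nest_depth(obj["child"]) if "child" in obj else 1


def parameter_grid(depth_range, width_range):
    """Yield (depth, width) pairs; each range is (min, max, step), max inclusive."""
    d_min, d_max, d_step = depth_range
    w_min, w_max, w_step = width_range
    for depth in range(d_min, d_max + 1, d_step):
        for width in range(w_min, w_max + 1, w_step):
            yield depth, width


if __name__ == "__main__":
    for depth, width in parameter_grid((1, 3, 1), (2, 6, 2)):
        obj = build_nested(depth, width)
        print(depth, width, nest_depth(obj))
```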

Next Steps

Refactoring the code to run faster is important. Profiling for bottlenecks, and potentially generating the files from stored data instead of the faker library, might be useful.

Analyzing the data: Data collection is underway! If you'd like to contribute, please let me know.

As part of analyzing the output, I also need to verify that the program is working as expected.

Contact me on Twitter @vincebrandon or through my site at vincebrand.com
