Hey all, Sam here; apologies for the following diatribe. I'm a full-stack software engineer doing graduate work in mathematics, specifically mathematical modeling. As a software engineer navigating the data science space, I have noticed that several common approaches don't take advantage of current technologies and instead reinvent the wheel: namely, the lack of database usage, structuring a project as a collection of scripts instead of as an application (API), having no clear way to interact with the trained results, and not leveraging containerized environments.
This is simply a containerized setup for data science development in a Jupyter Notebook, leveraging Neo4j as the database.
- Make sure that Docker is installed on your local machine.
- Download this project.
- Open a terminal and cd into this directory.
- Run

docker-compose up -d

- Open a browser to view your Jupyter Notebook and database interfaces at the URLs below (a sketch of the compose file follows this list).
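For orientation, here is a minimal sketch of what the docker-compose.yml wires together. The service names, image tag, and credentials below are illustrative assumptions; the actual docker-compose.yml in this repo is the source of truth.

```yaml
# Illustrative sketch only -- the docker-compose.yml in this repo is authoritative.
# Service names (neo4j, notebook) and credentials here are assumptions.
version: "3"
services:
  neo4j:
    image: neo4j:latest
    ports:
      - "7474:7474"   # HTTP browser interface
      - "7687:7687"   # Bolt protocol, used by database drivers
    environment:
      - NEO4J_AUTH=neo4j/password   # placeholder credentials
  notebook:
    build: .                        # image with the Python packages and CUDA drivers
    ports:
      - "8888:8888"                 # Jupyter Notebook
    depends_on:
      - neo4j
```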
In the project directory, you can run:
docker-compose up

Runs the app in development mode, with logs output to the terminal.
The Neo4j interface can be viewed locally: open http://localhost:7474/browser/ to view it in the browser.
The Jupyter Notebook container ships with all of the Python packages as well as the NVIDIA and CUDA drivers. Open http://localhost:8888/ to view the notebook in the browser.
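To confirm that the notebook container actually sees the GPU, a quick check from a notebook cell is a minimal sketch like the following (this assumes the host has NVIDIA drivers and the NVIDIA container toolkit set up, so that `nvidia-smi` is available inside the container):

```python
import subprocess

# Query the NVIDIA driver for visible GPUs; raises CalledProcessError
# if nvidia-smi is missing or no GPU is exposed to the container.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True, check=True).stdout)
```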
To connect to the database from inside the notebook, use the container name instead of localhost (see the sketch below). To shut everything down, either close the terminal or press Ctrl+C.
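As an example, here is a minimal connection sketch using the official neo4j Python driver. The service name (`neo4j`) and the credentials are assumptions; substitute whatever your docker-compose.yml and NEO4J_AUTH actually define.

```python
from neo4j import GraphDatabase

# Inside the notebook container, Docker's DNS resolves compose service names,
# so we use the container/service name rather than localhost.
# The service name "neo4j" and the credentials below are assumptions.
driver = GraphDatabase.driver("bolt://neo4j:7687", auth=("neo4j", "password"))

with driver.session() as session:
    result = session.run("RETURN 1 AS ok")
    print(result.single()["ok"])  # prints 1 if the connection works

driver.close()
```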
docker-compose up -d

This does the same as the previous command, but it runs the app detached, independent of your terminal being open. The services will keep running in the background until you cd into this directory and run docker-compose down.
docker-compose build --no-cache

If your containerized services are not behaving the way you would expect, run this command to rebuild all of the images from scratch. Then, when you run docker-compose up, it will use these fresh Docker images.