This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Lessons Learned

Larisa Ciupe edited this page Jul 28, 2020 · 29 revisions

This page collects the lessons learned in this project and suggested improvements.

1. General

1.1. Overall

image

1.2. Time Distribution

Overall, building the Dash views and making the application deployable took quite some time.

image

The other tasks were also challenging, but the time they needed was more predictable and therefore easier to manage. One solution could have been to use a ready-to-go dashboard solution such as Grafana or Kibana. However, given the larger scope of the project and the five team members, using the more complex but more flexible solution was the better choice.

1.3. Dash

Overall, the team was very satisfied with Dash. First, writing everything in Python enabled the team to leverage their previous knowledge. Second, Dash had all the functionality needed to develop the dashboard.

On the other hand, there were also some negative points. First, the documentation and community were good, but not great. Second, the dashboard had to be developed from scratch (there were no ready-to-use components). Third, implementing dynamic changes was quite time-consuming and difficult to understand. They were implemented using so-called Dash callbacks: functions that automatically update a component whenever a property of an input component changes.
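As an illustration of the callback mechanism described above, here is a minimal sketch and not the project's actual code. The sample data, the component IDs (`vehicle-type`, `vehicle-list`) and the function names are hypothetical, and the imports assume the Dash 2.x style (`from dash import ...`).

```python
# Hypothetical sample data; the real dashboard read its data from a database.
VEHICLES = [
    {"id": "V-1", "type": "truck"},
    {"id": "V-2", "type": "van"},
    {"id": "V-3", "type": "truck"},
]


def filter_vehicle_ids(vehicles, selected_type):
    """Return the IDs of all vehicles of the selected type."""
    return [v["id"] for v in vehicles if v["type"] == selected_type]


def build_app():
    """Build a small Dash app whose text output follows a dropdown."""
    # Imported here so the filtering logic above works without Dash installed.
    from dash import Dash, Input, Output, dcc, html  # Dash 2.x import style (assumption)

    app = Dash(__name__)
    app.layout = html.Div([
        dcc.Dropdown(
            id="vehicle-type",
            options=[{"label": t, "value": t} for t in ("truck", "van")],
            value="truck",
        ),
        html.Div(id="vehicle-list"),
    ])

    # Dash calls this function whenever the dropdown's "value" property changes
    # and writes the return value into the "children" property of the output.
    @app.callback(Output("vehicle-list", "children"), Input("vehicle-type", "value"))
    def update_vehicle_list(selected_type):
        return ", ".join(filter_vehicle_ids(VEHICLES, selected_type))

    return app
```

Keeping the filtering logic in a plain function makes it testable without a browser. In recent Dash releases, `build_app().run(debug=True)` would start the development server (older releases use `run_server` instead).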

2. Personal Learnings

2.1. Larisa

  • Dashboard creation using a new programming language (Python), different packages (Pandas, Plotly, etc.) and the Dash framework.
  • Adding business value using data science and predictions.
  • Adding interactivity to the dashboard using callbacks (e.g. for filtering graphs or adding visual effects).
  • Making use of open communities (e.g. Dash and Plotly).
  • Adjusting expectations and adapting to the limitations of the technologies and resources used.

2.2. Jakob

  • Working with Python and different libraries/frameworks e.g. Pandas, Dash, Plotly.
  • How to connect backend and frontend.
  • How to create interactivity using callbacks.

2.3. Lucie

The points listed for Jakob above also apply here:

  • Working with Python and different libraries/frameworks e.g. Pandas, Dash, Plotly.
  • How to connect backend and frontend.
  • How to create interactivity using callbacks.

Apart from that:

  • Creating different charts and graphs, including the underlying calculations
  • Working with GitHub as an agile team using Scrum, a Kanban board and the Wiki
  • Understanding the business value for the defined target group (fleet managers and decision owners)
  • Working efficiently as a remote team

2.4. Tim

  • Deeper insight into Python libraries such as pandas, NumPy and scikit-learn
  • First time training a machine learning algorithm
    • How do you prepare your data before training?
    • How can you test the fit of your model?
    • How to handle categorical variables when trying machine learning?
    • Understanding of gradient boosting
  • First time deploying an application on Google Cloud
    • Learned a lot about the use of the console and about the system architecture
  • Got better at handling multiple dependencies within one project
  • Found a lot of useful new features in PyCharm and optimized my workflows
  • Things to look out for when selecting a dataset
    • Would have been easier with a dataset where one row does not represent a completed journey.
  • Improved my GitHub workflows

2.5. Johannes

  • Conducting the user research
  • Creating the design concept (design mockup)
  • Researching dashboard solutions and data pipelines (batch vs streaming processing)
  • Setting up the basic pages with the routing using Python (was new for me)
  • Deploying the Dash application (Flask backend) on the Google Cloud Compute Engine using the Gunicorn Webserver
  • Styling the dashboard with Bootstrap and Flexbox
  • Project and documentation management
  • Further learnings are written below:

2.5.1. Deploying the Dash Application

This project's Dash application can be deployed on the Google Cloud Compute Engine as follows:

  1. Create a Google Cloud account and set up a VM instance on Google Cloud

Settings: smallest machine configuration (f1), boot disk left on the default (Debian), access scope "Allow full access to all Cloud APIs", firewall allowing HTTP and HTTPS traffic

  2. Connect to the instance via the Cloud Shell or a terminal

gcloud beta compute ssh --zone "europe-west3-c" "fleetboard-database" --project "abiding-orb-278309"

  3. Install virtualenv and Python (https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)

sudo apt install python3.7 python3-dev python3-venv python3-pip libffi-dev libssl-dev git

  4. Clone the git repository

git clone https://github.com/Fleet-Analytics-Dashboard/Application.git

  5. Change into the Application folder

cd Application/

  6. Create a virtual environment so that the dependencies are installed just for this application

python3 -m venv env

  7. Activate the virtual environment

source env/bin/activate

  8. Install the dependencies (this step can be skipped, because the dependency problems addressed in steps 9 and 10 need to be solved first)

pip3 install -r requirements.txt

  9. Install libpq-dev to fix "pg_config executable not found"

sudo apt-get install libpq-dev

  10. Install wheel to fix "Failed building wheel" when installing Dash (https://stackoverflow.com/questions/53204916/what-is-the-meaning-of-failed-building-wheel-for-x-in-pip-install)

pip3 install wheel

  11. Install the dependencies again now that the problems are fixed

pip3 install -r requirements.txt

  12. Run the Gunicorn web server

gunicorn -b 0.0.0.0:8080 main:server

  13. Go to the website

2.5.2. Dash app run issue "address already in use"

Switching branches may lead to an "address already in use" error (OSError: [Errno 48] Address already in use). After a branch switch, the previously started local Python application is not terminated automatically and keeps occupying the port, so it must be terminated manually. The solution is as follows:

  1. Open a terminal
  2. ps -fA | grep python (lists all running Python processes with their respective process IDs)
  3. kill <PID> (terminates the process with the given process ID from step 2)
  4. Now the app can run again on the local default port (it has been freed up)
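Before starting the app, it can help to check whether the port is still occupied. The following is a minimal sketch using only the standard library; the function name `port_in_use` is hypothetical, and 8050 is used in the comment because it is the default port of the Dash development server.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Example: port_in_use(8050) tells whether an old Dash process still holds the port.
```

If the function returns True although no app should be running, an old process is probably still alive and can be terminated as described in the steps above.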

2.5.3. Fix issues with psycopg2 on Mac OS

If the psycopg2 database connector cannot be installed on Mac OS, fix it as follows:

  1. Change psycopg2 to psycopg2-binary in the requirements.txt
  2. Run pip3 install -r requirements.txt again
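The change to the requirements file can also be scripted. This is a sketch under assumptions: the function name `use_binary_psycopg2` is hypothetical, and the version numbers in the usage example are placeholders, not the project's actual pins.

```python
import re

# Matches "psycopg2" at the start of a line, but not "psycopg2-binary"
# or other packages whose names merely start with "psycopg2".
_PSYCOPG2 = re.compile(r"^psycopg2(?![\w-])")


def use_binary_psycopg2(requirements_text: str) -> str:
    """Replace the psycopg2 requirement with psycopg2-binary."""
    lines = requirements_text.splitlines(keepends=True)
    return "".join(_PSYCOPG2.sub("psycopg2-binary", line) for line in lines)
```

For example, `use_binary_psycopg2("psycopg2==2.8.5\n")` yields `"psycopg2-binary==2.8.5\n"`, while a line that already names `psycopg2-binary` is left unchanged.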

2.5.4. Fix problems with wheel packages

The following error output indicates a problem with the installed psycopg2 wheel package:

from psycopg2._psycopg import ( # noqa
ImportError: dlopen(/Users/luciejuergens/PycharmProjects/untitled2/venv/lib/python3.7/site-packages/psycopg2/_psycopg.cpython-37m-darwin.so, 2): Library not loaded: @rpath/libssl.1.1.dylib
 Referenced from: /Users/luciejuergens/PycharmProjects/untitled2/venv/lib/python3.7/site-packages/psycopg2/_psycopg.cpython-37m-darwin.so
 Reason: image not found

It can be solved as follows:

  1. pip3 uninstall psycopg2
  2. pip install --pre -i https://testpypi.python.org/simple psycopg2

Resource: https://www.postgresql.org/message-id/CA%2Bmi_8bd6kJHLTGkuyHSnqcgDrJ1uHgQWvXCKQFD3tPQBUa2Bw%40mail.gmail.com

2.5.5. Fix enum error on Mac

When executing main.py, this error occurs:

AttributeError: module 'enum' has no attribute 'IntFlag'

This can be fixed by deactivating "Activate Google App Engine Support" in the preferences of PyCharm.

Resource: stackoverflow - App Engine Problem
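To narrow down this kind of error, it can help to see which `enum` module Python actually imports. On Python 3.6 and later, the standard library's `enum` provides `IntFlag`, so a missing attribute suggests that a different module named `enum` is found first on the import path. Whether that is exactly what the PyCharm setting causes is an assumption; the helper name `describe_enum_module` is hypothetical.

```python
import enum


def describe_enum_module() -> dict:
    """Report where the imported enum module lives and whether it has IntFlag."""
    return {
        "path": getattr(enum, "__file__", None),
        "has_int_flag": hasattr(enum, "IntFlag"),
    }


print(describe_enum_module())
```

If the reported path points outside the Python installation's standard library, the import path is likely being modified by the IDE or by an installed package.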

2.5.6. GitHub Branch Naming

Branch names follow these rules:

  • Lowercase letters only
  • Dashes as the only separator
  • No whitespace
  • No special characters (e.g. #, %, &, ...)
  • Example: 14-setup-dash-mvp
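The naming rules above can be checked automatically, for instance in a Git hook. The following sketch makes two assumptions that go beyond the list: digits are allowed (as in the example, which starts with a number), and a name may not start or end with a dash or contain two dashes in a row. The function name `is_valid_branch_name` is hypothetical.

```python
import re

# Lowercase letters and digits, grouped into parts separated by single dashes.
_BRANCH_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_branch_name(name: str) -> bool:
    """Return True if the name follows the branch naming rules."""
    return _BRANCH_NAME.fullmatch(name) is not None
```

With this pattern, `14-setup-dash-mvp` is accepted, while names such as `Setup-Dash`, `setup dash` or `fix#12` are rejected.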