Two Python applications: building a cohort table from simulated data, and counting venues in cities using the Foursquare API. A PDF report is included in "exercise-1".
- Building a cohort of signups vs. first orders. Folder: "exercise-1".
- The number of laundry, hairdressing and fitness stores per 100k inhabitants. Folder: "exercise-2".
The folder exercise-1 contains the Python module library_simulate_data.py, which generates a population of user IDs and signup timestamps:
| User ID | sign_up_timestamp |
|---|---|
| 0102V | 2018/04/02 |
| ... | ... |
together with a second table recording each user's order timestamps:

| User ID | order_timestamp |
|---|---|
| 2313Q | 2019/01/13 |
| ... | ... |
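The actual contents of library_simulate_data.py are in the repository; as an illustration only, a minimal simulation of the signups table (hypothetical function name and ID scheme, assuming IDs shaped like "0102V") could look like this:

```python
import random
import string
from datetime import datetime, timedelta

def simulate_signups(n_users, start, span_days, seed=0):
    """Simulate (user_id, sign_up_timestamp) rows with IDs like '0102V'."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_users):
        # Four digits followed by one uppercase letter, e.g. '0102V'.
        user_id = f"{rng.randrange(10_000):04d}{rng.choice(string.ascii_uppercase)}"
        # Signup time drawn uniformly within the simulated window.
        ts = start + timedelta(days=rng.uniform(0, span_days))
        rows.append((user_id, ts))
    return rows

signups = simulate_signups(100, datetime(2018, 4, 2), 90)
```

The orders table can be simulated the same way, drawing order timestamps after each user's signup.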
The goal is to construct from these two tables a cohort table counting, for the users who signed up in week N, how many placed their first order in week N + k. Something like this:
| Week | Signups | N+0 [%] | N+1 [%] | N+2 [%] | ... |
|---|---|---|---|---|---|
| Week 1 | 58 | 45 | 25 | 7 | ... |
| Week 2 | 12 | 34 | 23 | 9 | ... |
| ... | ... | ... | ... | ... | ... |
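The cohort construction can be sketched with pandas. This is a hypothetical implementation on tiny inline data, not the repository's actual code: it keeps each user's first order, computes the week offset k from signup to first order, and pivots to percentages of each signup cohort.

```python
import pandas as pd

# Tiny in-memory versions of the two tables described above.
signups = pd.DataFrame({
    "user_id": ["0102V", "2313Q", "7781A", "5520B"],
    "sign_up_timestamp": pd.to_datetime(
        ["2018-04-02", "2018-04-03", "2018-04-09", "2018-04-10"]),
})
orders = pd.DataFrame({
    "user_id": ["0102V", "2313Q", "7781A"],
    "order_timestamp": pd.to_datetime(
        ["2018-04-05", "2018-04-20", "2018-04-11"]),
})

# Keep only each user's first order.
first_orders = (orders.sort_values("order_timestamp")
                      .groupby("user_id", as_index=False).first())

df = signups.merge(first_orders, on="user_id", how="left")

# Signup week N, and offset k (in whole weeks) until the first order.
df["signup_week"] = df["sign_up_timestamp"].dt.to_period("W")
df["k"] = (df["order_timestamp"] - df["sign_up_timestamp"]).dt.days // 7

# Cohort sizes, then the share of each cohort converting in week N + k.
cohort_size = df.groupby("signup_week")["user_id"].count()
counts = df.dropna(subset=["k"]).pivot_table(
    index="signup_week", columns="k", values="user_id", aggfunc="count")
cohort = counts.div(cohort_size, axis=0).mul(100)  # percentages per N+k
```

Users who never ordered still count toward the cohort size (the `number` column) but contribute to none of the N+k percentage columns.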
Exercise 2 uses Foursquare API requests to obtain the number of laundry, hairdressing and fitness stores per 100k inhabitants in two cities in France and Germany. It consists of two scripts contained in the folder exercise-2: gymsLaundryBeaty_foursquareAPI.py and tableComparison.py. The latter script outputs a comparison table in LaTeX format. The results of the API requests are stored in .csv files. The foursquare_api_tools package was of great help and is warmly acknowledged (see foursquare_api_tools).
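Once the raw venue counts have been fetched from the API, the per-100k normalization is a one-liner. The figures below are purely hypothetical and serve only to illustrate the arithmetic:

```python
def per_100k(venue_count, population):
    """Convert a raw venue count into venues per 100,000 inhabitants."""
    return venue_count * 100_000 / population

# Hypothetical example: 540 laundries in a city of 2.16 million people.
density = per_100k(540, 2_160_000)  # -> 25.0 laundries per 100k inhabitants
```

Normalizing by population makes the counts comparable between cities of very different sizes, which is the point of the comparison table.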