View Live

Heatmap Code Challenge

It was a pleasure and also quite a challenge to work on this project. Below I have listed the issues that arose in the order that they occurred, what I tried, and how I solved them. Further down is my project steps outline.

ISSUES that arose:

CSV

Problem: The CSV file was too large for GitHub.

Solution: I added it to my .gitignore file, then used git reset --soft HEAD^ to undo the last commit and git reset HEAD heatmap/.GeoLite2-City-CSV_20190312/.GeoLite2-City-Blocks-IPv4.csv to remove the file from staging.

HEROKU ERROR

Problem: I got an error code when trying to launch the Heroku site: code=H10 desc="App crashed", and scrolling up the traceback I found: ModuleNotFoundError: No module named 'geodata-django'. I did a project-wide search for 'geodata-django' and found that I had entered it in the Procfile as web: gunicorn geodata-django.wsgi.

Tried:
1. I replaced 'geodata-django' with 'GeoData' but got the same message: No module named 'geodata-django'. Ultimately this step was partly the answer.
2. I reviewed the Heroku set-up and ran $ heroku ps:scale web=1 to ensure that at least one instance of the app was running. I got a positive response: Scaling dynos... done, now running web at 1:Free.
3. I restarted Heroku with heroku restart.
4. I connected a psql session to my remote database: $ heroku pg:psql output (abbr.)--> Connecting to gresql-polished-87072 psql blooming-journey-52100::DATABASE=>
5. I tried writing to the Procfile using echo "web: python app.py" > Procfile in the command line. This was a cool trick that I'm glad I got to try, but unfortunately I got the same result. (https://stackoverflow.com/questions/15790691/procfile-not-found-heroku-python-app)
6. Following (https://stackoverflow.com/questions/29481506/heroku-procfile-not-working), I ran $ heroku run bash and then $ cat Procfile. The output was web: gunicorn geodata-django.wsgi, and the error was still No module named 'geodata-django'. I entered web: gunicorn geodata.wsgi. (Progress! Now the error was No module named 'geodata'.) I repeated these steps with web: gunicorn GeoData.wsgi. (New error! ModuleNotFoundError: No module named 'GeoData.heroku_settings'.) And that's right, there was no module by that name at the time (it was temporarily commented out).
7. I un-commented heroku_settings.py and pushed to Git and to Heroku. I then got an error while running $ python manage.py collectstatic --noinput. I tried adding a CSS file under 'static/'.

Solution: The Procfile needed to read web: gunicorn GeoData.wsgi, but I had also neglected to push to Heroku and still had to fix a few other things, such as un-commenting heroku_settings.py and adding my requirements.txt.

Lessons Learned:

I should name all apps, projects, and files in lowercase to reduce the chance of this type of error. I am now well-versed in Heroku deployment. In the past I had done no more than one deployment per project, so I had no experience deploying frequently. I was following a guide and didn't have the process memorized. But I do now!

DATAFRAME VALUE ERROR

Problem: In load_geodata.py, print(df) works, but return(df) gets a ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Tried:
```
1. df = df[['latitude', 'longitude']]
   df.head()  -- no errors
   print(df.head())  -- outputs top 5 rows

      latitude  longitude
   0  -35.5016   138.7819
   1   24.4798   118.0819
   2   24.4798   118.0819
   3  -33.4940   143.2104
   4   23.1167   113.2500

2. print(df.all())  -- outputs
   latitude     False
   longitude    False
   dtype: bool

3. print(df.bool())  -- outputs
   ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

4. print(df.any())  -- outputs
   latitude     True
   longitude    True
   dtype: bool
```
Solution:

I consulted with Clinton from Momentum, who thought that pandas would be problematic for the future steps in this challenge. He suggested I use the simpler CSV reader that I had previously rejected, specifically the DictReader. So I scrapped pandas. I spent many hours working with pandas in this project. Goodbye beautiful code! I'll never forget you!

In memoriam:

```
import io

import pandas as pd


def handle(self, *args, **kwargs):
    # Truncated_data test:
    # truncated_data = '''
    # network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider,postal_code,latitude,longitude,accuracy_radius
    # 1.0.0.0/24,2070667,2077456,,0,0,5214,-35.5016,138.7819,100
    # 1.0.1.0/24,1811017,1814991,,0,0,,24.4798,118.0819,50
    # '''
    # df = pd.read_csv(io.StringIO(truncated_data), usecols=["latitude", "longitude"])

    # Parses from entire CSV file:
    df = pd.read_csv(
        "heatmap/.GeoLite2-City-CSV_20190312/.GeoLite2-City-Blocks-IPv4.csv",
        usecols=["latitude", "longitude"],
    )
    df = df[['latitude', 'longitude']]
    df.head()  # just top 5 rows
```

SAVE LAT AND LONGS FROM CSV

Problem:

I needed to figure out how to get the latitude and longitude data saved through the LatLong Model, so it could then be serialized and written to a JSON (or GeoJSON) API endpoint.

Solution:

I used the CSV DictReader to isolate the latitude and longitude columns and create model objects. I then used a ModelSerializer to control the serialization and a ListAPIView to return the GeoJSON data from the API endpoint.
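
The DictReader step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual management command: the helper name read_lat_longs is hypothetical, the sample rows are the two quoted in the pandas test earlier in this README, and the call that creates LatLong objects is left out.

```python
import csv
import io

# Header plus two sample rows in the GeoLite2 blocks layout quoted above.
SAMPLE_CSV = """network,geoname_id,registered_country_geoname_id,represented_country_geoname_id,is_anonymous_proxy,is_satellite_provider,postal_code,latitude,longitude,accuracy_radius
1.0.0.0/24,2070667,2077456,,0,0,5214,-35.5016,138.7819,100
1.0.1.0/24,1811017,1814991,,0,0,,24.4798,118.0819,50
"""


def read_lat_longs(csv_file):
    """Yield (latitude, longitude) string pairs from an open CSV file.

    Hypothetical helper: DictReader maps each row to a dict keyed by the
    header names, so only the two columns of interest are picked out.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield row["latitude"], row["longitude"]


pairs = list(read_lat_longs(io.StringIO(SAMPLE_CSV)))
print(pairs)
```
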
DELETE LARGE NUMBER OF TEST OBJECTS

Problem:

More than 109,000 model objects were created during testing, and they needed to be deleted before I loaded the full CSV file of more than 3 million rows.

Solution:

I commented out the code in the management command's handle function, and instead ran LatLong.objects.all().delete().

VALIDATION ERROR

Problem:

When running my management command to pull out latitudes and longitudes from the CSV and create objects, I got the following error:

.../python3.7/site-packages/django/db/models/fields/__init__.py", line 1559, in to_python params={'value': value}, django.core.exceptions.ValidationError: ["'' value must be a decimal number."]

Tried:

I looked through the CSV and saw that some of the lat/long values were whole numbers. I researched whether that could throw the error, and it didn't look like it should, because the decimal field of the LatLong model turns whole numbers into numbers with zeros after the decimal point. I revisited my choice of the decimal field over the float field and decided that the decimal field should be fine.

Solution:

To get around the error, I added this to my function:

    try:
        LatLong.objects.create(latitude=row['latitude'], longitude=row['longitude'])
    except ValidationError:
        pass
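
The '' in the error message indicates that the value being converted was an empty string, so at least some rows probably had blank latitude or longitude fields. That is an inference from the message, not something checked against the CSV. An alternative to swallowing the exception would be to check each row before creating the object. The sketch below uses only the standard library; the helper name parse_coordinate and the blank sample row are hypothetical, and the LatLong.objects.create call is omitted.

```python
from decimal import Decimal, InvalidOperation


def parse_coordinate(value):
    """Return value as a Decimal, or None if it is blank or unparseable."""
    text = (value or "").strip()
    if not text:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


rows = [
    {"latitude": "-35.5016", "longitude": "138.7819"},
    {"latitude": "", "longitude": ""},  # hypothetical blank row
    {"latitude": "24", "longitude": "118.0819"},  # whole number is fine
]

valid = []
for row in rows:
    lat = parse_coordinate(row["latitude"])
    lng = parse_coordinate(row["longitude"])
    if lat is None or lng is None:
        continue  # skip rows without usable coordinates
    valid.append((lat, lng))

print(valid)
```
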
ACCIDENTAL LARGE FILE PUSH

Problem:

I accidentally pushed my data.dump file to Git. I had created the file in an attempt to populate the database on Heroku from the database I had built locally.

Tried:

I deleted the file from the project and from Git, but Git still got stuck on it, as if the file were in limbo.

Solution:

I followed the steps in the link below:

(https://stackoverflow.com/questions/19573031/cant-push-to-github-because-of-large-file-which-i-already-deleted)

SERIALIZE MODEL OBJECTS INTO GEOJSON

Problem:

Even though hardcoded test points written in GeoJSON format rendered on the heatmap layer, and even though I wrote a serializer that rendered identical-looking GeoJSON, the console showed an error saying it was not valid GeoJSON.

Solution:

I researched, tweaked, and repeatedly tested the map.on('load', function() callback in my map.js, the serializer, and the view (and the type of view). What ultimately worked was adding a def list method to the ListAPIView.
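
One common cause of this kind of error is an endpoint that returns a bare JSON list of features, whereas a GeoJSON document has to be a single object such as a FeatureCollection. Whether that was the cause here is an assumption, since this README does not show the list method. A minimal sketch of the wrapping step in plain Python, with the hypothetical helper name to_feature_collection and made-up sample points:

```python
import json


def to_feature_collection(features):
    """Wrap an iterable of GeoJSON Feature dicts in a FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Sample Point features; coordinates are in [longitude, latitude] order.
points = [
    {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Point", "coordinates": [138.7819, -35.5016]},
    },
    {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Point", "coordinates": [118.0819, 24.4798]},
    },
]

collection = to_feature_collection(points)
print(json.dumps(collection, indent=2))
```
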
HEROKU DATABASE

Problem:

I followed the steps here to dump my local database into a file, commit it to Git, push it to Heroku, and then back the commit out of Git without pushing it to GitHub (because it's way too large). I never got an error, but it didn't populate my Heroku database.

Tried:

I repeated the steps, and confirmed it still didn't work.

Solution:

I used the same process, but with the CSV file. And then on Heroku I used the management command to read the CSV, pull out the latitude and longitude data, and create objects.

Problem 2:

The command above ran for around half an hour. About 10 minutes in, Heroku emailed me that I had exceeded my 10,000-row limit and that in 7 days they would have to revoke my "insert" privileges. When I stopped the command because it was no longer adding rows on Heroku, I had used around 145,000 rows. I did notice that the number decreased over time (around 139,000 one hour later).

Tried:

I considered paying for an upgraded plan, but that would have involved recreating the database. Since it was the final hour, I didn't want to risk taking a working (but limited) product down and re-running the load.

Solution:

The program is working and the heat layer loads, but the database holds only around 145,000 objects instead of over three million. To solve this problem, I would have needed to compress the data significantly somehow.
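
One way the data might be reduced, offered only as a sketch and not something tried in this project, is to collapse repeated coordinate pairs into a single row with a count. The sample output earlier in this README already shows one pair twice. The names aggregate_points and count are hypothetical, and how much this would shrink the full file is unknown.

```python
from collections import Counter


def aggregate_points(pairs):
    """Collapse identical (latitude, longitude) pairs into one entry each.

    Returns a list of dicts with the pair and how many times it occurred,
    in order of first appearance.
    """
    counts = Counter(pairs)
    return [
        {"latitude": lat, "longitude": lng, "count": n}
        for (lat, lng), n in counts.items()
    ]


# The five sample rows printed by df.head() above.
sample = [
    ("-35.5016", "138.7819"),
    ("24.4798", "118.0819"),
    ("24.4798", "118.0819"),
    ("-33.4940", "143.2104"),
    ("23.1167", "113.2500"),
]

aggregated = aggregate_points(sample)
print(aggregated)
```

A count stored this way could, in principle, be used as a per-point weight on the heat layer instead of plotting every duplicate separately.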

STEPS:

Set up Project

Create:
- repo on GitHub
- Django project
- html templates
- static files: CSS JavaScript (finish at the end)
- urls
- model(s), make migrations
- admin class for each model
- views for index page

Research how to access CSV file in Django

Write the python script

Test in Python shell

Create a management command to load CSV file

Test parsing lat/long data from sample (top three lines) of CSV

Parse data from entire IPv4 file
Decide between these two methods (initially chose pandas but later switched to CSV Reader):
- CSV module: (https://docs.python.org/3/library/csv.html)
- Parse specific columns from CSV file: (https://stackoverflow.com/questions/16503560/read-specific-columns-from-a-csv-file-with-csv-module)

Write code to return list of coordinates that can be used for JSON

Create model objects (25 minutes to load!)

Research MapBox

Must be JSON from the API to satisfy assignment requirements

Research using MapBox, MapBox gl JS, Leaflet, and Leaflet-heat to draw geographical data on a map in the browser

Convert JSON to GeoJSON in this format:
```
{
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [125.6, 10.1]
    }
}
```
Coordinates in this order: long, lat
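
A conversion like this can be sketched in a few lines of Python. The helper name point_feature is hypothetical, and the empty "properties" member is an addition: the GeoJSON specification lists it as a member of every Feature, even though the format shown above omits it.

```python
def point_feature(latitude, longitude):
    """Build a GeoJSON Point Feature from a latitude/longitude pair.

    Note the swap: GeoJSON coordinates are [longitude, latitude].
    """
    return {
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Point",
            "coordinates": [float(longitude), float(latitude)],
        },
    }


# Latitude 10.1, longitude 125.6, as in the format example above.
feature = point_feature("10.1", "125.6")
print(feature)
```
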

Bind in geo bounding box using MapBox

Django REST Framework buildout

Research endpoint requirements for heatmap

Define a REST endpoint that returns a list of coordinates within a geographic coordinate bounding box

url for api in heatmap.urls

api app:
- views
- serializers (https://www.django-rest-framework.org/api-guide/fields/#decimalfield)
- urls

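
The bounding-box requirement in this section comes down to a pair of range checks. Here is a minimal sketch in plain Python, assuming the box is given as south, west, north, and east edges in degrees and does not cross the antimeridian. The function names and the example bounds are hypothetical. In a Django view the same comparison could presumably be written as a queryset filter on the latitude and longitude fields, but that is not shown here.

```python
def in_bounding_box(lat, lng, south, west, north, east):
    """Return True if the point lies inside the box (edges inclusive).

    Assumes west <= east, i.e. the box does not cross the antimeridian.
    """
    return south <= lat <= north and west <= lng <= east


def filter_points(points, south, west, north, east):
    """Keep only the (lat, lng) pairs that fall inside the box."""
    return [
        (lat, lng)
        for lat, lng in points
        if in_bounding_box(lat, lng, south, west, north, east)
    ]


points = [
    (-35.5016, 138.7819),
    (24.4798, 118.0819),
    (-33.4940, 143.2104),
    (23.1167, 113.2500),
]

# Illustrative bounds only.
inside = filter_points(points, south=20.0, west=110.0, north=30.0, east=120.0)
print(inside)
```
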
Deploy to Heroku

Create requirements.txt for dependencies

Install Heroku

(https://blooming-journey-52100.herokuapp.com/)

Finishing Steps:

    1. Follow steps from (https://jaketrent.com/post/django-loaddata-heroku/) to dump data from database onto Heroku (bypasses Git in a way). Remove .json file from project.

    2. Serialize model objects to output GeoJSON; store them in an api endpoint

        serializers.py
        api/views.py
        api/urls.py

    3. Push to Heroku

    4. Add api endpoint to map.addSource; check that it adds points on heat layer of map.

    5. Check that requirements.txt is up to date

    6. Push to Git

    7. Push to Heroku, limiting the queryset to 1000 so the server connection will not time out. Continue to raise the queryset limit and test

    8. Submit

Ardrasal/GeoData-Django
