# Lab 6a
## The Final Lab!

I know it seems like only yesterday we were first writing "import arcpy," but here we are. You've learned a lot along the way and now, in this final lab, I'm going to ask you to put it all together, try some new things, and *have fun* with your new skills. The lab comes in two parts, 6a and 6b (you'll get 6b next week), and is due **March 1st** (so two weeks *after* you get part b).

In part a, we'll be playing with APIs - namely the ArcGIS API and Twitter's Streaming API. We use Esri's because it is the direction they are taking their product and Twitter's because it is well established and well-documented. *Their use here is not an endorsement or signal of the meaning to be found in tweets or the mapping thereof.*

### First, let's set up our environment

Ultimately, you are going to make some interactive web maps of tweets for me. However, the exact means by which you do it is left up to you (e.g. will you use Folium or the ArcGIS API, perhaps geopandas to handle the back end, etc.). As such, I'm going to recommend you create an environment with the following libraries, but you may not end up using them all:

`conda create -n lab6 python=3.6`

After you've created and activated that environment, install the following:

`conda install -c esri arcgis`

`conda install -c conda-forge geopandas jupyter folium fiona tweepy geopy`

You should recognize most of those libraries, but the last two are new. I am using [tweepy](http://www.tweepy.org/) to access the Twitter API, but there are lots of other libraries; a popular one you might use is [TwitterSearch](https://pypi.python.org/pypi/TwitterSearch/). Additionally, [geopy](https://pypi.python.org/pypi/geopy) is a geocoding library for Python. Again, lots of these exist and you are free to find one that works best for you; however, the examples here will be using these libraries.

### Now, let's get the required authorizations to use the Twitter API

First, you need a Twitter account. You don't have to ever use it outside of this class and I am *definitively not endorsing Twitter in any way*. They have a free, well-documented API that you can use, that's it.

Once you have an account, [go here](https://apps.twitter.com/) and create a new application. Fill in the appropriate information; for the 'website' field, link to your GitHub account. You can ignore the Callback URL for this use.

After you have created the application, page over to your **Keys and Access Tokens**. You are going to need to find: A Consumer Key, a Consumer Secret, an Access Token, and an Access Token Secret.

Your Consumer Key and your Consumer Secret are listed in your **Keys and Access Tokens** section. You can **create your access token** (and its secret) there as well.

These four codes are a way for Twitter to keep track of who is accessing their API, when, and to do what.
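Since these four codes work a lot like passwords, you may prefer not to paste them straight into a notebook you share. Here is a minimal sketch of one way to do that, assuming you have stored them in environment variables. The variable names (`TWITTER_CK` and friends) and the `load_credentials` helper are my own invention for illustration, not something Twitter or tweepy requires.

```python
import os

# Hypothetical variable names; pick whatever you like, just be consistent.
REQUIRED = ("TWITTER_CK", "TWITTER_CS", "TWITTER_AK", "TWITTER_AS")

def load_credentials(env=os.environ):
    """Return the four Twitter credentials found in `env` as a dict."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail early with a readable message instead of a confusing auth error later
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

You would then call `creds = load_credentials()` and pass `creds['TWITTER_CK']`, `creds['TWITTER_CS']`, and so on wherever the keys are needed.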

#### Let's see if this worked.


In [6]:
import tweepy

CK = 'BFODcpwYfmOp6Ohyx5vfHmaD3'
CS = 'gyKLznqzhQLGEoolbec0q6uPOkY04ML6qGD4JQDnKbJfbythW4'
AK = '80691476-8ise2zNzExPPn9yBCXP3h2t7lQWsoG2b3LCJK6II5'
AS = 'ru6INJzJhSh3kvonVIku20B8lU3kGYIl4OPywCeiLC95v'

auth = tweepy.OAuthHandler(CK, CS)
auth.set_access_token(AK, AS)

api = tweepy.API(auth)

print(api.user_timeline(id='uwtacoma', count=1)) #This simply pulls the last tweet from an account

[Status(_api=<tweepy.api.API object at 0x11301e400>, _json={'created_at': 'Thu Feb 08 16:56:39 +0000 2018', 'id': 961644960331124736, 'id_str': '961644960331124736', 'text': "UW Tacoma is pleased to announce @ColumbiaBankNW is the recipient of this year's Corporate Gold Star Award. Columbi… https://t.co/fjNwamtW3B", 'truncated': True, 'entities': {'hashtags': [], 'symbols': [], 'user_mentions': [{'screen_name': 'ColumbiaBankNW', 'name': 'Columbia Bank', 'id': 139475570, 'id_str': '139475570', 'indices': [33, 48]}], 'urls': [{'url': 'https://t.co/fjNwamtW3B', 'expanded_url': 'https://twitter.com/i/web/status/961644960331124736', 'display_url': 'twitter.com/i/web/status/9…', 'indices': [117, 140]}]}, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 27892153, 'id_str': '27892153', 'nam

### If that ran successfully, you should have a giant mess of text.

That's the data that accompanies a single tweet. Interesting, huh? Check out the [reference docs](http://tweepy.readthedocs.io/en/v3.5.0/api.html#) for tweepy and spend some time experimenting if you want.

Here, I'll pull the same tweet as above, but this time I'm **only** going to print out the text property of the Status object and then check for some location information.



In [4]:
#Note: First I specify which list object I want, then I pull a property from it.
#I fetch the tweet once and reuse it, so this costs one API call instead of three.
tweet = api.user_timeline(id='uwtacoma', count=1)[0]

print(tweet.text)

#Now, let's see if there's some lat and long associated with the tweet
print(tweet.geo)
print(tweet.coordinates)

UW Tacoma is pleased to announce @ColumbiaBankNW is the recipient of this year's Corporate Gold Star Award. Columbi… https://t.co/fjNwamtW3B
None
None


### (Un)fortunately, most tweets don't actually have location information associated with them. 

There's been *a lot* written about this and the numbers vary from under 5% to 20% or so of tweets. Additionally, it's been argued that upwards of 60% of tweets *can* have some location inferred due to language use, topic, etc.

That's all interesting (and please do email me for citations if you so desire); however, it's also kind of beside the point here. We're interested in learning how to interact with APIs and process data; we can argue about the ephemerality of said data another day.

Let's query some topic of interest and see if we can find some spatial data.

Now, it's important to note that Twitter has both a REST API (which includes search) and a streaming API. So far, we've been using the REST API to ask for tweets that already exist. I am now going to switch to the streaming one; **however**, lab6scratch.py has an example of how to step through search API results.


In [1]:
import tweepy

#We're going to set up a couple of tricks here

CK = 'BFODcpwYfmOp6Ohyx5vfHmaD3'
CS = 'gyKLznqzhQLGEoolbec0q6uPOkY04ML6qGD4JQDnKbJfbythW4'
AK = '80691476-8ise2zNzExPPn9yBCXP3h2t7lQWsoG2b3LCJK6II5'
AS = 'ru6INJzJhSh3kvonVIku20B8lU3kGYIl4OPywCeiLC95v'

auth = tweepy.OAuthHandler(CK, CS)
auth.set_access_token(AK, AS)

#By setting these values to true, our code will automatically wait as it hits its limits
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

#Now I'm going to set up a stream listener
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        text = status.text
        user = status.author.screen_name
        print('%s tweeted %s' % (user, text))
    
print("Running")
while True:
    try:
        stream = tweepy.Stream(auth=api.auth, listener=CustomStreamListener())
        #This next line puts a bounding box roughly around Ottawa, Ontario.
        #You start in the southwest and then go to the northeast
        #The format is longitude, then latitude... cuz Twitter
        stream.filter(locations=[-76.130, 45.068, -75.438, 45.593])
    except Exception as e:
        print(e)
        print('Trying to continue')
        continue
        

Running
Janet_AM tweeted @bobrage1 @Lakeler @jenny_nq Ain't that the truth 🤣
inaemichele_ tweeted Cara eu amo o Whindersson, kkkkkkkkkk q cara foda https://t.co/eZ7Op9xzJJ
rogergumley tweeted @bloodless_coup You are correct. I have the attention span of an indoor palm watching TV at night (too many monitor… https://t.co/ZnqAj8vSlz
officialadizzle tweeted Yo wtf metro Boomin is actually soooooo foiiiineeeeeee
montrealdesign tweeted Drouin has to realize that hockey is more than just PP opportunities.
#Habs
@GagnonFrancois
Thomas_Duncan tweeted @SimonDingleyCBC @Jenny2Hugs @Q107Toronto Great place to take shelter from the coming onslaught. I'm sure they'll r… https://t.co/f6OZbFsTZ8
BarbieGirl13xo tweeted Winner winner chicken dinner! 🙌🏻🎉🎲🥇
Thomas_Duncan tweeted @SimonDingleyCBC @Jenny2Hugs @Q107Toronto *kinds
kaaateorade tweeted @kreitsn I LOVE YOU TOO BBY 💞💞


KeyboardInterrupt: 

### If that all worked, you now have a listener that will pull tweets from a bounded area you define.

**Cool**. Well, I think so. But, even though we're now pulling tweets *from* a location, you aren't saving their spatial data... *quite yet*.
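The longitude-first ordering in `stream.filter(locations=...)` is easy to get backwards, so here is a small sketch of a helper that takes the corners in the more familiar latitude, longitude order and hands back the list in the order Twitter expects. The `bounding_box` function is hypothetical (my own, not part of tweepy), and it assumes your box does not cross the 180° meridian.

```python
def bounding_box(sw_lat, sw_lon, ne_lat, ne_lon):
    """Build [west, south, east, north] from SW and NE corners given as lat, lon."""
    if not (-90 <= sw_lat < ne_lat <= 90):
        raise ValueError("south latitude must be less than north latitude")
    if not (-180 <= sw_lon < ne_lon <= 180):
        raise ValueError("west longitude must be less than east longitude")
    # Twitter wants the southwest corner first, and longitude before latitude
    return [sw_lon, sw_lat, ne_lon, ne_lat]

# The same box used in the streaming example above
box = bounding_box(45.068, -76.130, 45.593, -75.438)
print(box)
```

The resulting list can be passed directly as the `locations` argument.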

That's where the lab actually begins.

### Question 1: Where the tweets at?

Using the example code above **and** what you'll find in the lab 6 scratchpad, start pulling the spatial information from the tweets in question. Create a 'file' that contains a tweet's author (account name), its text, and the location from which it came (in latitude and longitude). This 'file' can be in a number of formats (geojson, txt, csv, etc.). 

Bear in mind, there are *a few* ways you can pull location information. You can find the [twitter api documentation here](https://developer.twitter.com/en/docs/tutorials/filtering-tweets-by-location).
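To make "a few ways" a little more concrete, here is a rough sketch that works on a tweet's JSON once it has been loaded into a Python dict. It looks for an exact point in `coordinates` first, then falls back to the `bounding_box` of the tweet's `place`. The `tweet_point` function is hypothetical, and averaging the box corners is my own simplification; geocoding the place name is another option.

```python
def tweet_point(tweet):
    """Return (lat, lon) for a tweet dict, or None if it carries no location."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude first
        return (lat, lon)
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        # The box is a polygon: a list holding one ring of [lon, lat] corners
        ring = place["bounding_box"]["coordinates"][0]
        lon = sum(pt[0] for pt in ring) / len(ring)
        lat = sum(pt[1] for pt in ring) / len(ring)
        return (lat, lon)
    return None
```

Returning `None` when nothing is found lets the caller decide whether to skip the tweet or try something else.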

Some tweets will come from a 'location' that is a named place. In order to handle those, you will need to geocode the information. The function below takes a string and returns latitude and longitude. Start there.

In [9]:
from geopy import geocoders
import tweepy
import os
import csv
import json

#If there is an existing output file, remove it
try :
    os.remove('streamout.csv')
except OSError:
    pass

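#Returns [lat, long] from a given location string
#Usage Example: print(geoCode('1900 Commerce St, Tacoma, WA')) RETURNS: [47.2452, -122.4427]
#Note: geopy 2.x requires a name here, e.g. Nominatim(user_agent='some-app-name')
def geoCode(location):
    try :
        g = geocoders.Nominatim()
        loc = g.geocode(location)
        return [loc.latitude, loc.longitude]
    except Exception:
        #Also reached when nothing matched, because geocode() then returns None
        print("Error parsing location!")
        return [0,0]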

#Set up Tweepy
CK = 'BFODcpwYfmOp6Ohyx5vfHmaD3'
CS = 'gyKLznqzhQLGEoolbec0q6uPOkY04ML6qGD4JQDnKbJfbythW4'
AK = '80691476-8ise2zNzExPPn9yBCXP3h2t7lQWsoG2b3LCJK6II5'
AS = 'ru6INJzJhSh3kvonVIku20B8lU3kGYIl4OPywCeiLC95v'
auth = tweepy.OAuthHandler(CK, CS)
auth.set_access_token(AK, AS)
twitterApi = tweepy.API(auth)

#Writes tweet info to a CSV
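def writeToFile(user, text, lat, long) :
    #newline='' stops the csv module from adding blank rows on Windows;
    #utf-8 keeps emoji and accented characters from raising encoding errors
    with open('streamout.csv', 'a', newline='', encoding='utf-8') as myFile:
        fileWriter = csv.writer(myFile)
        fileWriter.writerow([user, text, lat, long])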

#A stream listener
#It parses coordinates or place coordinates and passes them to a CSV writer
class CustomStreamListener(tweepy.StreamListener):
    def on_data(self, data):
        all_data = json.loads(data)
        user = all_data['user']['screen_name']
        text = all_data['text']
        print(text)
        if all_data['coordinates'] :
            long = all_data['coordinates']['coordinates'][0]
            lat = all_data['coordinates']['coordinates'][1]
            writeToFile(user, text, lat, long)
        elif(all_data['place']) :
            coords = geoCode(all_data['place']['full_name'])
            writeToFile(user, text, coords[0], coords[1])
        
#MAIN
#Run a stream using the stream listener I created earlier
while True:
    try:
        stream = tweepy.Stream(auth=twitterApi.auth, listener=CustomStreamListener())
        #Filter by Southern Ontario
        stream.filter(locations=[-82.271, 42.344, -78.525, 44.290])
    except Exception as e:
        print(e)
        print('Trying to continue')
        continue


@DragonsRFC @UPRIGHTRUGBY @yorkulions Great group of athletes! Thanks for having me!
@weedmapsca It’s actually the second @JWCmed first
@ericgarcia90 @sarahjanehuff Me, sitting in your driveway for 20 min before your mom comes out to knock on my windo… https://t.co/BN3Oskjquq
@TheCristianoWay  https://t.co/jqZ2n6eWko
Ayo chill https://t.co/AaSTNIuR8Y
To Better Understand Segregation, Look at Social Networks https://t.co/SdyzNXQbHP
Driving an hour and half to birthday party in the snow
@adamwillsdev ‘Our turn-over rate is 18months’
Well I survived Walmart....barely....and they took all my money https://t.co/1xlJaFdJ6p
@ramzymd If Abouna stood at the pulpit, and said "I'm a Buddhist, now", wouldn't he remain Coptic by ancestry?
@SeanFitz_Gerald If I had grown up in Calgary I’d either be an Olympian or paralyzed by now. I did a lot of dumb th… https://t.co/NCmDvSeVvq
Let's connect everyone in the world to bring the world a better place ! Support everyone's… https://t.co/XMCo8tdlxL
@rhpssc

KeyboardInterrupt: 

### Question 2: Tweets on a map.

Now that you have a 'file' (or a script that will extract author, text, and location from tweets), let's make a map.

Using Folium, ArcGIS API for Python, GeoPandas, or Arcpy, create a map from your file. Make sure you accumulate enough tweets (let's say 100 or so) before you create the map.

Next week, we'll get into how to update the map on the fly and make it more interactive; for now, just make sure you can query some tweets, parse the data, and put that data into a GIS of some form.
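If your 'file' is a CSV of author, text, latitude, longitude rows, one possible intermediate step is to convert it to GeoJSON, which Folium and GeoPandas can both read. This is only a sketch: the `rows_to_geojson` function is hypothetical, and the two sample rows are made up to stand in for your own file.

```python
import csv
import io
import json

def rows_to_geojson(rows):
    """Turn (user, text, lat, long) rows into a GeoJSON FeatureCollection dict."""
    features = []
    for user, text, lat, long in rows:
        features.append({
            "type": "Feature",
            # GeoJSON points are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [float(long), float(lat)]},
            "properties": {"user": user, "text": text},
        })
    return {"type": "FeatureCollection", "features": features}

# Two made-up rows standing in for the contents of your CSV
sample = io.StringIO('alice,"hello, Tacoma",47.2452,-122.4427\nbob,hi,45.42,-75.69\n')
collection = rows_to_geojson(csv.reader(sample))
print(json.dumps(collection, indent=2))
```

To use a real file, swap the `io.StringIO` object for `open('yourfile.csv', newline='', encoding='utf-8')`.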

### Bonus: Did I say you could use any 'GIS'? How about all of them?

Create a script that lets the user specify the format in which they would like the resulting map (ArcGIS API, Shapefile, GeoPandas, or Folium). +1 point for each additional format (**total possible bonus: +3**).
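One way to structure such a script is a command-line flag plus a lookup table of export functions. The sketch below shows only that skeleton: the four `to_...` functions are hypothetical placeholders that return a string, and you would replace their bodies with your real map-making code.

```python
import argparse

# Placeholder exporters: hypothetical stubs, not real library calls.
def to_folium(path):
    return "folium map from " + path

def to_shapefile(path):
    return "shapefile from " + path

def to_geopandas(path):
    return "GeoDataFrame from " + path

def to_arcgis(path):
    return "ArcGIS web map from " + path

EXPORTERS = {
    "folium": to_folium,
    "shapefile": to_shapefile,
    "geopandas": to_geopandas,
    "arcgis": to_arcgis,
}

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Map tweets in a chosen format")
    parser.add_argument("csv_path", help="file of author, text, lat, long rows")
    # choices makes argparse reject any format that has no exporter
    parser.add_argument("--format", choices=sorted(EXPORTERS), default="folium")
    return parser.parse_args(argv)

def main(argv=None):
    args = parse_args(argv)
    return EXPORTERS[args.format](args.csv_path)
```

From a notebook, pass the arguments as a list, e.g. `main(['streamout.csv', '--format', 'shapefile'])`, since the default of reading `sys.argv` picks up Jupyter's own arguments.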