# Lab 5 - Exploring Azure

## Welcome

This is it. This is our final lab together and one that is entirely unique in terms of approach and expectations. I'll explain all of that in a moment, but first I wanted to congratulate all of you.

#### Well done.

This course - _this quarter_ - is a difficult one and all of you have put in the time and effort to get through it. For some of you, Python might have clicked immediately; for others, it can take time... _it might still be happening_. But, just like with writing, the more you do it, the better you'll become.

As I've stressed repeatedly, rather than memorizing every piece of syntax, make sure you focus on **how** computation works: the core ideas of flow control, iteration, and abstraction; functions, classes, and methods. These ideas aren't unique to Python, and if you can grasp them, you'll be able to work in almost any programming language. **Yes**, there will be hiccups, but if you can _think computationally_ - if you can break apart a big problem into the steps that a computer can follow - then you can accomplish amazing things.

#### Which brings us back to this lab.

Unlike your other labs, this lab is much more like a task you'd face at a job. Yes, the outcome is a bit silly, but the *linking together of tools in the cloud through a programming language* is **exactly** the skill you'll be expected to have. What you're learning here really is a skillset that's at the forefront of where geospatial technologies are moving. We're going to be working directly with REST APIs and virtual machines, integrating our workflow into the computational power available through cloud services.

But that opportunity comes with a cost. Unlike in our other labs, things will break; *they might not go as planned*. In the past, a unique error might pop up (as we are all running complicated systems that leverage lots of different libraries), but fundamentally I knew what the solution _should be_ every time. It was 'canned' in a sense; that's not what's happening here. In this case, you're working with real data and live systems.

It's a risk, but it comes with an awfully big reward.

#### So what does that mean?

Well, it means that grade-wise you are all going to do fine. Let's just get that out of the way. It means that we have to trust one another and help one another through this process.

And, with all of that in mind, let's begin.

## Getting set up.

As always, you're going to want to build your environment.
I would recommend you use Python 3.7 and that you install jupyter, folium, and geopandas. I used the `-c conda-forge` channel option to install all three at once, but do what works for you.

In addition, you're going to need to install the Azure service we're going to be using:

`pip install azure-cognitiveservices-search-newssearch`

You'll also have access to their maps service and their sentiment analysis (although you could also use one of our other mapping, geocoding, or sentiment analysis services instead - such as folium, geopy, or NLTK).
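
Before moving on, it can help to confirm that everything actually installed. Here's a minimal sketch of one way to check. The helper name `missing_packages` is hypothetical, and the Azure module path is an assumption based on the pip package name above.

```python
from importlib.util import find_spec

def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing

# Assumed import names for the packages installed above.
required = ["folium", "geopandas", "azure.cognitiveservices.search.newssearch"]
print("Missing:", missing_packages(required) or "nothing")
```

If anything shows up as missing, install it into the same environment your notebook is running in.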

## Our goal and some guidance

Here's where things get tricky. Unlike in other labs, I don't have a set of questions for you to work through. Instead, I'm going to give you an end goal and then some tips on how you might approach it. I'll also be constantly visiting each group to help you generate ideas, goals, and methods.

For what it's worth, this is how your Cartography seminar will work, so think of this as a bit of a taste of what's to come. **Have fun**. Be bold and be creative. 

### Your task: Create a map of how people are talking about a place

**Wait, what?**

What I want you to do is create a map that visualizes how a place is being talked about in the news, on social media, etc.

**Wait, how?**

Well, take a look at [this quickstart guide for sentiment analysis](https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/quickstarts/python). You can find another [tutorial here](https://www.pingshiuanchua.com/blog/post/simple-sentiment-analysis-python). That second one uses a few approaches, but I recommend you stick with Azure or NLTK.

Sentiment analysis gives you a very rough, machine-learning-generated sense of the 'emotion' (positive or negative) in a given set of texts. It's extremely useful, for example, if you are trying to monitor how people are talking about your business(es) online.
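
To make that concrete, here is a rough sketch of how a batch of texts could be packaged for Azure's sentiment service over REST. It only builds the request and does not send it. The function name, the `v3.0` path segment, and the placeholder endpoint and key are all assumptions; follow the quickstart for the exact URL and version that match your resource.

```python
import json
import urllib.request

def build_sentiment_request(texts, endpoint, key, language="en"):
    """Build (but do not send) a POST request asking for each text's sentiment."""
    # Each document needs its own id so results can be matched back to inputs.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    # Assumed path; the version segment may differ for your resource.
    url = endpoint.rstrip("/") + "/text/analytics/v3.0/sentiment"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

request = build_sentiment_request(
    ["I love the Tacoma waterfront.", "Traffic on I-5 was awful today."],
    endpoint="https://YOUR-RESOURCE.cognitiveservices.azure.com",  # placeholder
    key="YOUR-KEY",  # placeholder
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (with `urllib.request.urlopen` or the `requests` library) should give back one result per document id, which is why the ids matter when you join scores back to your stories.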

Then take a look at [this guide on conducting news searches](https://docs.microsoft.com/en-us/azure/cognitive-services/bing-news-search/news-sdk-python-quickstart). (Note: you've already set up your environment the way they do at the beginning.)
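
Since you need a few hundred stories, you will probably have to ask for results a page at a time. The sketch below only builds the request URLs for a paged search; it assumes the REST endpoint shown, the `count` and `offset` query parameters, and a page size of 100, so check those against the guide before relying on them. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed REST endpoint for the news search service.
NEWS_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/news/search"

def news_search_urls(query, total=250, page_size=100):
    """Return one URL per page so that the pages together cover `total` results."""
    urls = []
    for offset in range(0, total, page_size):
        # The last page may be smaller than the rest.
        count = min(page_size, total - offset)
        params = {"q": query, "count": count, "offset": offset}
        urls.append(NEWS_ENDPOINT + "?" + urlencode(params))
    return urls

for url in news_search_urls("Tacoma"):
    print(url)
```

Each request would also need your key in an `Ocp-Apim-Subscription-Key` header. A search can return fewer stories than you asked for, so count what actually comes back.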

You can also read about Azure Maps' geocoding service [here](https://docs.microsoft.com/en-us/rest/api/maps/search). You have access to this service (see the passkeys below).
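
As a rough illustration of what a geocoding call involves, the sketch below builds an address-search URL and pulls the first coordinate pair out of a response. The URL layout, the parameter names, and the response shape are assumptions based on the REST reference linked above, the function names are hypothetical, and the sample response is made up so the example runs offline.

```python
from urllib.parse import urlencode

def azure_maps_geocode_url(query, key):
    """Build an address-search URL for a free-text place name."""
    params = {
        "api-version": "1.0",
        "subscription-key": key,
        "query": query,
        "limit": 1,  # only the best match is needed
    }
    return "https://atlas.microsoft.com/search/address/json?" + urlencode(params)

def first_position(response):
    """Return (lat, lon) of the first result, or None if nothing matched."""
    results = response.get("results", [])
    if not results:
        return None
    position = results[0]["position"]
    return position["lat"], position["lon"]

print(azure_maps_geocode_url("Tacoma, WA", "YOUR-KEY"))

# Made-up response in the assumed shape, standing in for the parsed JSON reply.
sample_response = {"results": [{"position": {"lat": 47.2529, "lon": -122.4443}}]}
print(first_position(sample_response))
```

Returning `None` for an empty result list keeps the calling code simple: you can skip any story that could not be placed on the map.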

But remember, you have other sources of information as well! You could, for example, monitor Twitter for tweets about a place (talk to me if you'd like to do this); you could use the geopy geocoder (you've seen examples of this in my code, and you can read more about it [here](https://geopy.readthedocs.io/en/stable/)); or you could - being careful with your credits - use the ArcGIS API geocoder or other services.
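
Whichever geocoder you pick, many of your entries will share the same place name, and free services tend to limit how fast you can call them. One option is to remember answers you have already looked up. This is a minimal sketch, assuming your geocoder is a function that takes a place string; `make_cached_geocoder` and `fake_geocode` are hypothetical names, and the fake geocoder is only there so the example runs offline.

```python
from functools import lru_cache

def make_cached_geocoder(geocode_fn, maxsize=1024):
    """Wrap a geocoding function so repeated place names are only looked up once."""
    @lru_cache(maxsize=maxsize)
    def lookup(key):
        return geocode_fn(key)

    def geocode(location):
        # Normalize so "Tacoma, WA" and " tacoma, wa " share one cache entry.
        key = (location or "").strip().lower()
        if not key:
            return None
        return lookup(key)

    return geocode

# Stand-in for a real geocoder; it records every lookup it is asked to do.
calls = []
def fake_geocode(place):
    calls.append(place)
    return {"tacoma, wa": (47.2529, -122.4443)}.get(place)

geocode = make_cached_geocoder(fake_geocode)
for place in ["Tacoma, WA", " tacoma, wa ", "", "Tacoma, WA"]:
    print(repr(place), "->", geocode(place))
print("lookups actually made:", len(calls))
```

With geopy you could pass in something like the `geocode` method of a `Nominatim` object instead of `fake_geocode`. Note that failed lookups (a `None` result) are cached too, which is usually what you want for junk location strings.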

**At this point, you have a lot of tools open to you**. Talk with your group, talk with me, come up with a plan.

#### Your final output must have:

1. At least 250 entries (so 250 news stories, tweets, Instagram posts, whatever).
2. Some indication of the 'sentiment' of the entry (it can be as simple as 'blue for positive, red for negative' push pins; obviously, more interesting results are better).
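
The simplest version of that second requirement might look something like the sketch below. It assumes your sentiment score runs from -1 (negative) to 1 (positive); some services report scores on a different range, so rescale first if yours does. The function name, the gray 'neutral' color, and the 0.05 cutoff are arbitrary choices for illustration.

```python
def sentiment_color(score, neutral_band=0.05):
    """Map a sentiment score in [-1, 1] to a push pin color."""
    if score > neutral_band:
        return "blue"   # positive
    if score < -neutral_band:
        return "red"    # negative
    return "gray"       # too close to zero to call

for score in (0.6, -0.3, 0.0):
    print(score, "->", sentiment_color(score))
```

The returned string can be passed straight to a marker's color option in most mapping libraries.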
    

### A sample workflow

If you're stuck, you can follow along with me as I build something during the lab.

Here's the workflow I'll follow:
1. Conduct a news search about an area (probably Tacoma, but somewhere else if that doesn't return enough results).
2. Run the results of that search through the sentiment analysis.
3. Map it:
   - First, simply put points on a map, with the color tied to the sentiment.
   - Then, get fancier. I'll try to create a heat map and then even normalize that heat map based on sentiment or population or something. Who knows where we'll end up; it's exciting to poke at code.
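
For the heat map step, one way to bring sentiment in is to weight each point by how strongly positive or negative it is. folium's `HeatMap` plugin accepts rows of `[lat, lon, weight]`, so the sketch below just prepares those rows. It assumes scores between -1 and 1; the function name and the sample coordinates are hypothetical.

```python
def heatmap_rows(records, negative=True):
    """Turn (lat, lon, score) records into [lat, lon, weight] rows.

    With negative=True only negative entries are kept, weighted by how
    negative they are; with negative=False the same is done for positive ones.
    """
    rows = []
    for lat, lon, score in records:
        score = max(-1.0, min(1.0, score))  # clamp stray values into [-1, 1]
        weight = max(0.0, -score) if negative else max(0.0, score)
        if weight > 0:
            rows.append([lat, lon, weight])
    return rows

# Hypothetical sample records: (latitude, longitude, sentiment score).
records = [
    (47.25, -122.44, -0.5),
    (47.60, -122.33, 0.8),
    (45.50, -122.60, 0.0),
]
print(heatmap_rows(records))
print(heatmap_rows(records, negative=False))
```

Building one layer from the negative rows and another from the positive rows lets you toggle between "where is the talk negative?" and "where is it positive?" on the same map.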
        
        
You can find the keys for the existing services in a password-protected file [here](https://github.com/UWTMGIS/GIS501_w19_files/blob/master/w19_azurekeys.pdf.zip). I will provide the password in class.
    

# Let's go!
## Get fancy, get creative, have fun.

In [5]:
#http://docs.tweepy.org/en/latest/
#https://wafawaheedas.gitbooks.io/twitter-sentiment-analysis-visualization-tutorial/sentiment-analysis-using-textblob.html
#https://stackoverflow.com/questions/41058798/python-geopy-error-handling
#https://www.earthdatascience.org/courses/earth-analytics-python/using-apis-natural-language-processing-twitter/analyze-tweet-sentiments-in-python/

from geopy import geocoders
from geopy.exc import GeocoderTimedOut
import tweepy
from tweepy import OAuthHandler
from textblob import TextBlob
import json
import csv

#tweepy keys
CK = ''
CS = ''
AK = ''
AS = ''

#set up tokens
auth = tweepy.OAuthHandler(CK, CS)
auth.set_access_token(AK, AS)

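#make our code wait in case it hits our rate limits
#(written for tweepy 3.x; tweepy 4+ removed wait_on_rate_limit_notify and renamed api.search to api.search_tweets)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)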

#search term (filters out retweets)
new_search = 'coronavirus -filter:retweets'

#create list to store tweets in
tweets = []

#search for tweets based on search term, language, and location, and append to the list
for tweet in tweepy.Cursor(api.search,
              q=new_search,
              lang='en',
              geocode='39.50,-98.35,2000mi').items(300):
    tweets.append(tweet)

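#geocode the location of the tweets; returns (lat, lon), or None if the lookup fails
def geo(location):
    g = geocoders.Nominatim(user_agent='emge', timeout=None)
    try:
        loc = g.geocode(location)
    except Exception:
        #network or service error: treat it as "no location" (a bare except would also swallow Ctrl-C)
        return None
    #geocode() returns None when nothing matches
    if loc is None:
        return None
    return loc.latitude, loc.longitude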

#append one tweet's info to the csv
def WriteCSV(user, text, sentiment, lat, long):
    #newline='' stops the csv module from writing blank rows on Windows
    with open('tran_emge_tweets.csv', 'a', encoding="utf-8", newline='') as f:
        write = csv.writer(f)
        write.writerow([user, text, sentiment, lat, long])

#for each tweet, work out its author, text, sentiment, and location, and append them to the csv
for tweet in tweets:
    Author = tweet.author.screen_name
    Text = TextBlob(tweet.text)
    Sentiment = Text.sentiment.polarity
    #user.location is free text and is often an empty string
    if tweet.user.location:
        #geocode once and reuse the result
        coords = geo(tweet.user.location)
        if coords is not None:
            Lat, Long = coords
            WriteCSV(Author, Text, Sentiment, Lat, Long)
    else:
        print('No location information')


| | User | Tweet | Sentiment | Lat | Long |
| --- | --- | --- | --- | --- | --- |
| 0 | skirchy | @jessbrammar @jane49578228 @high_fades @JuliaH... | 0.0 | 38.894893 | -77.036553 |
| 1 | Maku1316 | SPRING BREAK CANCELED!? Miami Calls Coronaviru... | 0.2 | 36.701463 | -118.755997 |
| 2 | AspergersAreUs | @astradisastra thank you for being the only pe... | 0.0 | 42.360253 | -71.058291 |
| 3 | rtw702 | @messquire23 How could you possibly spend any ... | -0.15625 | 22.736266 | -81.815918 |
| 4 | Divi9elyTay | Oh heck! 😩😵😷\r\n\r\nFirst 2 cases of coronavir... | 0.325 | 32.329381 | -83.113737 |


In [6]:
#print the results in a pandas table
import pandas

tweets = pandas.read_csv('tran_emge_tweets.csv', names=['User', 'Tweet', 'Sentiment', 'Lat', 'Long'])
tweets

| | User | Tweet | Sentiment | Lat | Long |
| --- | --- | --- | --- | --- | --- |
| 0 | cullinwible | @traderstewie Comparing the flu to the #corona... | 0.000000 | 40.075738 | -74.404162 |
| 1 | Dizzedcom | Coronavirus updates: Deaths pass 3,000, half o... | -0.166667 | 40.712728 | -74.006015 |
| 2 | EntwistleF | G7 finance ministers plan call on coronavirus ... | 0.000000 | 50.000678 | -86.000977 |
| 3 | newsfilterio | Cruise line stocks fall as coronavirus cases i... | 0.000000 | 40.712728 | -74.006015 |
| 4 | lame_uhl | don’t wash your hands because you’re afraid of... | -0.250000 | 32.329381 | -83.113737 |
| 5 | EastIslipPatch | Also: Rare snowy owl spotted / Coronavirus can... | 0.050000 | 40.732082 | -73.188330 |
| 6 | eastvillagetwt | NYC doctor has to 'plead' with health departme... | 0.000000 | 40.729269 | -73.987361 |
| 7 | jamiaw | Coronavirus forces MTA to give the subway a th... | 0.000000 | -1.291466 | 36.783078 |
| 8 | GuardianAus | The first economic modelling of coronavirus sc... | -0.183333 | -24.776109 | 134.755000 |
| 9 | jobinindia | Uber tells drivers to stay home if they have t... | 0.000000 | 43.653963 | -79.387207 |
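
Before mapping, it's worth checking how many entries actually made it into the file and how the scores break down. This is a small sketch using only the standard library; it assumes rows in the order User, Tweet, Sentiment, Lat, Long with no header row, and the function name and 0.05 cutoff are hypothetical. The sample rows are shortened from the output above so the example runs without the real file.

```python
import csv
import io

def summarize_sentiment(rows, neutral_band=0.05):
    """Count positive, negative, and neutral rows; Sentiment is the third column."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for row in rows:
        score = float(row[2])
        if score > neutral_band:
            counts["positive"] += 1
        elif score < -neutral_band:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    counts["total"] = sum(counts.values())
    return counts

# Shortened sample rows; for the real data, open the csv file with newline=''
# and pass csv.reader(f) instead.
sample = io.StringIO(
    'cullinwible,"@traderstewie Comparing the flu...",0.0,40.075738,-74.404162\n'
    'Dizzedcom,"Coronavirus updates: Deaths pass 3,000...",-0.166667,40.712728,-74.006015\n'
    'EastIslipPatch,"Also: Rare snowy owl spotted...",0.05,40.732082,-73.188330\n'
)
summary = summarize_sentiment(csv.reader(sample))
print(summary)
```

If `total` comes in under 250, you'll need to collect more entries (or loosen your search) before the map meets the requirement.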


In [8]:
#sources in addition to documentation for libraries:
#https://stackoverflow.com/questions/56876620/unsure-how-to-use-colormap-with-folium-marker-plot
#https://www.kaggle.com/daveianhickey/how-to-folium-for-maps-heatmaps-time-data

import folium
import pandas
from folium import plugins
from folium import FeatureGroup, LayerControl
from folium.plugins import HeatMap
import branca
import branca.colormap as cm

#import csv
tweets = pandas.read_csv('tran_emge_tweets.csv', names=['User', 'Tweet', 'Sentiment', 'Lat', 'Long'])

#generate folium map
map = folium.Map(location=[39.50, -98.35],
              zoom_start=3,
              tiles='Stamen toner')

#colorscale points layer
points_layer = FeatureGroup(name = 'Tweet Points (Red = Negative Sentiment, Blue = Positive Sentiment)')

#set color scale
colormap = cm.LinearColormap(colors=['red','blue'], index=[-1,1],vmin=-1,vmax=1)

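#pull out the columns we need
lat = tweets.Lat
lon = tweets.Long
sentiment = tweets.Sentiment

#draw one circle per tweet, colored by its sentiment
#(folium.Circle's radius is in meters; folium.CircleMarker takes a radius in pixels instead)
for loc, score in zip(zip(lat, lon), sentiment):
    folium.Circle(
        location=loc,
        radius=10,
        fill=True,
        color=colormap(score),
        fill_opacity=0.7
    ).add_to(points_layer)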

#heatmap layer
heatmap_layer = FeatureGroup(name = 'Tweet Heatmap')
tweet_locations = tweets[['Lat', 'Long']].values
HeatMap(tweet_locations).add_to(heatmap_layer)

colormap.add_to(map)
points_layer.add_to(map)
heatmap_layer.add_to(map)
folium.LayerControl(collapsed=True).add_to(map)

map