# **Final Project - Part 1**
## Data Collection
*By Ethan Lee, Tyler Ngo, Diane Li, James Dai*

This first part of our submission walks through the code we
used to collect the stock sentiment data included with the
submission.

We begin by importing the necessary modules.

In [19]:
import requests
import os
from google.cloud import language_v1
import csv

We then authenticate with the Twitter API using a bearer
token. This token is generated when an individual creates a
Twitter developer account for the purpose of using the API.
We send a request to the Twitter search endpoint for the
tweets we are looking for, with the bearer token included as
the Authorization header. The response body is JSON, which we
parse to obtain the list of tweets. Note that one of the
arguments given to the API request, until, is the date that we
are collecting tweets for (the API returns tweets created
before that date).

In [13]:
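# bearer.txt is assumed to hold the full header value ("Bearer <token>");
# strip() removes any trailing newline.
with open('bearer.txt', 'r') as f:
    bearer = f.read().strip()
h = {
    "Authorization": bearer
}

# Query parameters: search term, language, cutoff date, and page size.
params = {"q": "iphone", "lang": "en", "until": "2020-12-18",
          "include_entities": "true", "count": 100}
result = requests.get("https://api.twitter.com/1.1/search/tweets.json", params=params, headers=h)
result.raise_for_status()  # stop early on an authentication or rate-limit error
tweets = result.json()["statuses"]
print("length:", len(tweets))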

length: 100
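
Since the same request is repeated for each date, the header and query construction could be factored into small helpers. The following is only a sketch: the names bearer_header and search_params are hypothetical, the token shown is a placeholder, and it assumes the token file may hold either the raw token or the full "Bearer <token>" value.

```python
def bearer_header(token):
    """Return an Authorization header, adding the "Bearer " prefix if it is missing."""
    token = token.strip()
    if not token.lower().startswith("bearer "):
        token = "Bearer " + token
    return {"Authorization": token}


def search_params(query, until, count=100, lang="en"):
    """Build the query parameters used by the search request above."""
    return {
        "q": query,
        "lang": lang,
        "until": until,  # date in YYYY-MM-DD form
        "include_entities": "true",
        "count": count,
    }


# Example: the parameters for the request shown above.
params = search_params("iphone", "2020-12-18")
headers = bearer_header("abc123\n")  # "abc123" is a placeholder, not a real token
```

These values could then be passed to requests.get through its params and headers arguments, changing only the date between runs.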


We now extract the text of each tweet from the parsed JSON into a list.

In [20]:
tweets_content = []
sentiments = []
for t in tweets:
    # keep only the text of each tweet
    tweets_content.append(t["text"])
tweets_content

['RT @DJAB_1998: iPhone shoot📲 \n\nSunset Vibe ✨ https://t.co/irPagaxCLW',
 'RT @UberFacts: Steve Jobs invented the iPhone out of spite because he hated a particular Microsoft employee',
 "RT @jamescharles: Today's Sismas Box includes an iPhone 12 📱👀 Find out how to enter on my Instagram story now!",
 "RT @ZackSnyder: Gal's first day on set in the costume July 2014, shot on my iPhone 6. I knew when I cast her back in the fall of 2013 she w…",
 '@tim_cook How about stereo photography \nAsk @DrBrianMay all about it and put two lenses at either end of the iPhone… https://t.co/5pNvReobeK',
 'Idk what color to do my iPhone aesthetic , but I know I’m ready to change it 😩',
 '@paigewalmsleyx Receiving notifications for messages is important, so we’d like to help. To begin, let’s make sure… https://t.co/2AS9gOFZYX',
 'AND- my copies were made via iPhone scanning app. It was so time consuming but my students were amazing',
 '@SpazzWesson I have an iphone',
 'RT @wccftech: iOS 14.3 Battery Test 
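
Several of the tweets above end in "…" because the search endpoint truncates long text by default. One possible refinement, which we did not use for our collection, is to request tweet_mode=extended and read the full_text field, falling back to text when it is absent. The sketch below assumes that response shape; the helper name tweet_text is hypothetical and the sample dictionaries are made-up stand-ins, not real API results.

```python
def tweet_text(tweet):
    """Return the most complete text available for a tweet dictionary."""
    # For retweets, the original tweet is nested under "retweeted_status".
    original = tweet.get("retweeted_status")
    if original is not None:
        return tweet_text(original)
    # "full_text" is assumed to be present only with tweet_mode=extended.
    return tweet.get("full_text") or tweet.get("text", "")


# Made-up examples shaped like API results (not real data).
plain = {"text": "I have an iphone"}
extended = {"full_text": "a longer tweet that is not cut off"}
retweet = {"text": "RT @someone: a longer tw…", "retweeted_status": extended}

texts = [tweet_text(t) for t in (plain, extended, retweet)]
```

Note that for retweets this returns the original tweet's text, so the "RT @user:" prefix is dropped.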

Next, we use the credentials stored in *My First Project-f1fd24b13f57.json*
to authorize and authenticate requests to Google Cloud's Natural Language
API. This credentials file is created when setting up an account, as described at
https://cloud.google.com/natural-language/docs/analyzing-sentiment.

In [16]:
# Point the Google client library at the credentials file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "My First Project-f1fd24b13f57.json"

client = language_v1.LanguageServiceClient()

We now use this API to calculate the sentiment of each of the tweets in the
list.

In [23]:
sentiments = []  # reset so that re-running this cell does not duplicate scores
for text in tweets_content:
    document = language_v1.Document(content=text, type_=language_v1.Document.Type.PLAIN_TEXT)

    # Detect the sentiment of the text; score ranges from -1.0 (negative) to 1.0 (positive)
    sentiment = client.analyze_sentiment(request={'document': document}).document_sentiment

    sentiments.append(sentiment.score)
    print("sentiment score: ", sentiment.score)

sentiment score:  0.20000000298023224
sentiment score:  -0.699999988079071
sentiment score:  0.20000000298023224
sentiment score:  0.0
sentiment score:  0.0
sentiment score:  0.20000000298023224
sentiment score:  0.0
sentiment score:  0.30000001192092896
sentiment score:  -0.10000000149011612
sentiment score:  0.10000000149011612
sentiment score:  0.0
sentiment score:  0.4000000059604645
sentiment score:  0.0
sentiment score:  0.4000000059604645
sentiment score:  -0.6000000238418579
sentiment score:  -0.30000001192092896
sentiment score:  -0.800000011920929
sentiment score:  0.0
sentiment score:  0.30000001192092896
sentiment score:  -0.5
sentiment score:  0.0
sentiment score:  -0.4000000059604645
sentiment score:  0.0
sentiment score:  0.0
sentiment score:  -0.10000000149011612
sentiment score:  0.0
sentiment score:  -0.30000001192092896
sentiment score:  -0.800000011920929
sentiment score:  0.0
sentiment score:  0.0
sentiment score:  0.0
sentiment score:  0.0
sentiment score:  0.0
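
The scores lie between -1.0 and 1.0, so one quick way to read the output above is to bucket it into labels. This is only an illustrative sketch: the function label_score and the 0.25 threshold are our own hypothetical choices, not something defined by the Natural Language API.

```python
from collections import Counter


def label_score(score, threshold=0.25):
    """Map a sentiment score to a coarse label (the threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# The first eight scores printed above, rounded to one decimal place.
sample = [0.2, -0.7, 0.2, 0.0, 0.0, 0.2, 0.0, 0.3]
counts = Counter(label_score(s) for s in sample)
```

A different threshold would shift tweets between the neutral bucket and the other two, so the labels should be read as a rough summary only.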

Finally, we write the tweets and their corresponding sentiment scores to a CSV
file. There is one CSV file for each date that we collected.

In [24]:
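# newline='' avoids blank rows on Windows; utf-8 keeps emoji and other non-ASCII text intact
with open('dec18.csv', 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile)
    for text, score in zip(tweets_content, sentiments):
        # write the text as str; writing .encode("utf-8") bytes would store the b'...' repr
        writer.writerow([text, score])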

print("Average sentiment", sum(sentiments) / len(sentiments))


Average sentiment -0.03235294005157901
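
Because there is one CSV file per date, the file name could be derived from the date, and the file read back to check what was written. The sketch below is a minimal illustration under stated assumptions: the helper names csv_name, write_rows and read_rows are hypothetical, the two-digit day in the file name is a guess at the naming convention, and the rows are placeholder values rather than collected data.

```python
import csv
import os
import tempfile
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def csv_name(day):
    """Return a file name such as 'dec18.csv' for a given date."""
    return f"{MONTHS[day.month - 1]}{day.day:02d}.csv"


def write_rows(path, texts, scores):
    """Write one (text, score) row per tweet."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(zip(texts, scores))


def read_rows(path):
    """Read the rows back, converting each score to a float."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(text, float(score)) for text, score in csv.reader(f)]


# Placeholder rows (not real tweets), including a comma, an emoji and a newline.
texts = ["placeholder tweet, with a comma 📱", "second placeholder\nwith a newline"]
scores = [0.2, -0.7]
path = os.path.join(tempfile.mkdtemp(), csv_name(date(2020, 12, 18)))
write_rows(path, texts, scores)
rows = read_rows(path)
```

Reading the file back this way confirms that commas, emoji and embedded newlines in tweet text survive the round trip.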
