Collecting historical data from Twitter (tweets with stock tickers/symbols = $AAPL and Twitter feeds/news = Apple Inc. #AAPL OR @Apple) #3

Closed
DusanStevic opened this issue Mar 1, 2021 · 0 comments · Fixed by #58
Assignees: DusanStevic
Labels: Data acquisition and collection, Sentiment analysis subsystem, Tools and libraries


DusanStevic commented Mar 1, 2021

Twitter Symbols:

  1. #AAPL
  2. @apple
  3. $AAPL

Twitter limitations and restrictions:

  1. Twitter provides REST APIs that developers can use to access and read Twitter data, as well as a Streaming API for accessing Twitter data in real time. Most software written to access Twitter data is a library that wraps Twitter's Search and Streaming APIs and is therefore constrained by the limitations of those APIs. With Twitter's Search API you can send only 180 requests every 15 minutes. With a maximum of 100 tweets per request, you can mine at most 72,000 tweets per hour (4 x 180 x 100 = 72,000). With TwitterScraper you are not limited by this number, only by your internet speed/bandwidth and the number of TwitterScraper instances you are willing to start. One of the bigger disadvantages of the Search API is that you can only access tweets written in the past 7 days. This is a major bottleneck for anyone looking for older data. With TwitterScraper there is no such limitation (Twitter API approach).
  2. Unfortunately, these scrapers rely on Twitter’s front end, so any change to the front end can stop them from working (Twitter scraping approach).
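
The Search API ceiling in item 1 is plain arithmetic. A minimal sketch, using only the figures quoted above (180 requests per 15-minute window, 100 tweets per request); the function name is illustrative and not part of any library:

```python
def max_tweets_per_hour(requests_per_window=180,
                        tweets_per_request=100,
                        window_minutes=15):
    """Upper bound on tweets retrievable per hour under a fixed-window rate limit."""
    windows_per_hour = 60 // window_minutes  # 4 windows of 15 minutes each
    return windows_per_hour * requests_per_window * tweets_per_request


print(max_tweets_per_hour())  # 4 * 180 * 100 = 72000
```

This is a best-case bound: it assumes every request returns a full page of 100 tweets.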

Twitter API approach:
Twitter developer account
Tweepy: An easy-to-use Python library for accessing the Twitter API.
Twitter REST API, problem: Twitter's API limitations and restrictions.
Twitter Streaming API, problem: Twitter's API limitations and restrictions.

Twitter scraping approach:
https://pypi.org/project/GetOldTweets3/ problem: HTTP Error 404, although the URL is working.
https://github.com/Jefferson-Henrique/GetOldTweets-python problem: HTTP Error 404, although the URL is working.

https://pypi.org/project/twitterscraper/ problem: HTTP Error 404, although the URL is working.
https://github.com/taspinar/twitterscraper problem: HTTP Error 404, although the URL is working.

https://github.com/twintproject/twint success: Works like a charm.
https://medium.datadriveninvestor.com/scrape-twitter-without-limits-using-twint-92509f2503cd success: Works like a charm.
Twint basic usage success: Works like a charm.
Tweet attributes success: Works like a charm.
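
A rough sketch of what basic Twint usage for this issue might look like. The query combines the three symbols from the issue title. The date range, output file name and both function names are placeholders chosen for illustration, not values taken from the project. `twint.Config` and `twint.run.Search` are Twint's documented entry points; the import is done lazily so the query helper can be used without Twint installed.

```python
def build_query(ticker="AAPL", handle="Apple"):
    """Combine the cashtag, hashtag and mention into a single OR query."""
    return f"${ticker} OR #{ticker} OR @{handle}"


def collect_tweets(query, since, until, output_csv):
    """Run a Twint search and store the results as CSV (needs `pip install twint`)."""
    import twint  # lazy import: only required when a search is actually run

    c = twint.Config()
    c.Search = query         # search terms
    c.Since = since          # start date, "YYYY-MM-DD"
    c.Until = until          # end date, "YYYY-MM-DD"
    c.Store_csv = True       # write results to a CSV file
    c.Output = output_csv    # path of that CSV file
    c.Hide_output = True     # do not echo every tweet to the console
    twint.run.Search(c)


print(build_query())
# Example call (placeholder dates and file name; needs network access):
# collect_tweets(build_query(), "2020-01-01", "2020-12-31", "aapl_tweets.csv")
```

Because Twint scrapes the front end, the same caveat as above applies: a front-end change can break the call without any change to this code.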
Twitter Symbols:

  1. The "#" symbol is called a "hashtag", e.g. #AAPL (people use the hashtag to attach a theme or "tag" to their tweets, usually in connection with a trending or popular topic, such as #[Company]IPO or #Marketing).
  2. The "@" symbol means what it appears to mean, "at", e.g. @Apple (the intention of the @ is to get someone's attention).
  3. The "$" symbol is called a "cashtag" e.g. $AAPL (Twitter uses cashtags ($) to track tweets on stock tickers)
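
For the sentiment analysis step it can help to pull these three kinds of symbols out of a tweet's text. A minimal sketch using Python's `re` module; the patterns are deliberate simplifications (Twitter's own entity rules are stricter), and the function name and sample tweet are made up for illustration:

```python
import re

# Simplified patterns (assumption): a symbol must not directly follow a word character.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")
MENTION = re.compile(r"(?<!\w)@(\w{1,15})")
CASHTAG = re.compile(r"(?<!\w)\$([A-Za-z]{1,6})\b")


def extract_symbols(text):
    """Return the hashtags, mentions and cashtags found in a tweet's text."""
    return {
        "hashtags": HASHTAG.findall(text),
        "mentions": MENTION.findall(text),
        "cashtags": CASHTAG.findall(text),
    }


sample = "Big day for @Apple: $AAPL hits a new high #AAPL"
print(extract_symbols(sample))
# {'hashtags': ['AAPL'], 'mentions': ['Apple'], 'cashtags': ['AAPL']}
```

The cashtag pattern only accepts letters, so a price such as `$150` is not mistaken for a ticker.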
@DusanStevic DusanStevic linked a pull request May 1, 2021 that will close this issue
@DusanStevic DusanStevic moved this from To do to Review in progress in Investment Decision Support System May 1, 2021
@DusanStevic DusanStevic moved this from Review in progress to Done in Investment Decision Support System May 1, 2021