Needs only user tweets - Help? #5

Open
wahshibinharb opened this issue Jul 21, 2020 · 6 comments

Comments

@wahshibinharb

Hello, and thank you for sharing!

The script works fine. The problem is that when I run it, it collects all of the user's tweets, retweets, and comments. When I tried to run the script for a year of data, it returned only 1000 tweets, which is fine so far. But because it fetches everything (tweets, retweets, and comments), those 1000 items cover only about 2 months.

Is it possible to get only the user's own tweets? How can I do that?

Thanks again.

@sarikayamehmet
Owner

For a user search, use this:
searchTerm <- "(from%3ArealDonaldTrump)" # for user search

Also increase the parameter below to get more data:
ntweets = 1000
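
To keep replies and retweets out of the results as well, the query could be extended before it is encoded. This is only a minimal sketch: it assumes the search endpoint honors Twitter's `-filter:replies` and `-filter:retweets` search operators, and `build_user_query` is a hypothetical helper, not part of the script.

```r
# Hypothetical helper: builds a URL-encoded search term for one user's tweets.
# Assumption: the endpoint accepts the "-filter:replies" and
# "-filter:retweets" search operators.
build_user_query <- function(user,
                             exclude_replies = TRUE,
                             exclude_retweets = TRUE) {
  term <- paste0("from:", user)
  if (exclude_replies) {
    term <- paste(term, "-filter:replies")
  }
  if (exclude_retweets) {
    term <- paste(term, "-filter:retweets")
  }
  # reserved = TRUE also encodes ":" and spaces
  URLencode(term, reserved = TRUE)
}

searchTerm <- build_user_query("realDonaldTrump")
print(searchTerm)
```

Because the result is already percent-encoded, a later `URLencode(searchTerm)` call leaves it unchanged.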

@wahshibinharb
Author

wahshibinharb commented Jul 22, 2020

Thank you!

Is there a way to see how many times a certain phrase or group of words has been tweeted?

Example: I want to know how many times the word "ayasofya" was tweeted between two given dates.

@sarikayamehmet
Owner

You can use the code below to count the tweets:

# Packages the script relies on
library(jsonlite)  # fromJSON
library(rvest)     # read_html, html_nodes, html_attr and the %>% pipe

# Input parameters
startdate <- "2020-07-23"
enddate <- "2020-07-24"
language <- "tr"
searchTerm <- "ayasofya"
searchbox <- URLencode(searchTerm)
# Build the search URL; the paging cursor is appended after max_position=
temp_url <- paste0("https://twitter.com/i/search/timeline?f=tweets&q=",searchbox,"%20since%3A",startdate,"%20until%3A",enddate,"&l=",language,"&src=typd&max_position=")

# Start with an empty result so the final count also works when nothing is found
tweet_ids <- character(0)
webpage <- fromJSON(temp_url)
if (webpage$new_latent_count > 0) {
  # IDs from the first page of results
  tweet_ids <- read_html(webpage$items_html) %>% html_nodes('.js-stream-tweet') %>% html_attr('data-tweet-id')
  breakFlag <- FALSE
  # Follow the paging cursor until no more items are reported
  while (isTRUE(webpage$has_more_items)) {
    tryCatch({
      min_position <- webpage$min_position
      next_url <- paste0(temp_url, min_position)
      webpage <- fromJSON(next_url)
      next_tweet_ids <- read_html(webpage$items_html) %>% html_nodes('.js-stream-tweet') %>% html_attr('data-tweet-id')
      next_tweet_ids <- next_tweet_ids[!is.na(next_tweet_ids)]
      tweet_ids <- unique(c(tweet_ids, next_tweet_ids))
    },
    error = function(cond) {
      message(paste("URL does not seem to exist:", next_url))
      message("Here's the original error message:")
      message(conditionMessage(cond))
      breakFlag <<- TRUE
    })

    # Stop paging after the first failed request
    if (breakFlag) {
      break
    }
  }
} else {
  message("There are no tweets for this search term!")
}
print(length(tweet_ids))

@wahshibinharb
Author

wahshibinharb commented Jul 24, 2020

Result:

URL does not seem to exist: https://twitter.com/i/search/timeline?f=tweets&q=ayasofya%20since%3A2020-07-23%20until%3A2020-07-24&l=tr&src=typd&max_position=thGAVUV0VFVBaEgLvxj5WO2iMWgIC7meGVstojEjUAFQAlAFUAFQAVARUAFQAA
Here's the original error message:
HTTP error 429.

It makes me wait about 10 minutes and then fails.

@wahshibinharb
Author

I am also getting this:

URL does not seem to exist: https://twitter.com/i/search/timeline?f=tweets&q=ayasofya%20since%3A2020-07-23%20until%3A2020-07-24&l=tr&src=typd&max_position=thGAVUV0VFVBaAwL2BhYGY2iMWgIC7meGVstojEjUAFQAlAFUAFQAVARUAFQAA
Here's the original error message:
HTTP error 429.

print(length(tweet_ids))
[1] 9400
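
HTTP 429 is the "Too Many Requests" status code, so the failures above look like rate limiting rather than a missing URL. One common way to cope is to retry with an increasing pause between attempts. A minimal sketch, under the assumption that the server accepts requests again after a wait; `fetch_with_backoff` and its parameters are hypothetical names, not part of the script:

```r
# Hypothetical helper: calls fetch() and retries with exponential backoff.
# fetch     - a zero-argument function performing the request
# max_tries - total number of attempts before giving up
# base_wait - pause in seconds after the first failure; doubles each retry
# sleep     - function used to pause (replaceable for testing)
fetch_with_backoff <- function(fetch, max_tries = 5, base_wait = 1,
                               sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    # Capture an error as a value instead of aborting
    result <- tryCatch(fetch(), error = function(cond) cond)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt == max_tries) {
      stop(result)  # out of attempts: re-raise the last error
    }
    sleep(base_wait * 2^(attempt - 1))
  }
}
```

Inside the paging loop this could wrap the request, for example `webpage <- fetch_with_backoff(function() fromJSON(next_url), base_wait = 30)`; the 30-second starting pause is an arbitrary choice, not a documented limit.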

@M-1993-fsu

Hi,
I tried to use your script today, but after the line "webpage <- fromJSON(temp_url)" I get the following error: "Error in open.connection(con, "rb") : HTTP error 404.". If I open the URL in my browser, it turns out the page doesn't exist anymore. Is this the actual issue? How should I correct it? Maybe it's trivial, but I only started using R and the Twitter API last week.


3 participants