how to customize json result? - Navigating through json response #13
Comments
If you are using the async-await branch, use the following (it is now available on the master branch as well; see the code further down):
from tweeterpy import TweeterPy
from tweeterpy import util
twitter = TweeterPy()
# Get tweets or other data.
data = twitter.get_user_tweets("elonmusk", total=50)
# Extract values by key from the nested Python dict: pass in the dataset and the name of the key you want.
# NOTE: There may be multiple keys with the same name. For example, "id" could be the ID of a tweet, a user, or a conversation thread. Pass a unique key, or pass a dataset in which the key is unique.
usernames = util.find_nested_key(data, "screen_name")
If you are using the master branch, define the following function in your project and call it like any other function. (It is now available on the master branch as well. Import it from the tweeterpy.util module; see the code below for details.)
from functools import reduce
def find_nested_key(dataset=None, nested_key=None):
    def get_nested_data(dataset, nested_key, placeholder):
        if isinstance(dataset, list) or isinstance(dataset, dict) and dataset:
            if isinstance(dataset, list):
                for item in dataset:
                    get_nested_data(item, nested_key, placeholder)
            if isinstance(dataset, dict):
                if isinstance(nested_key, tuple) and nested_key[0] in dataset.keys():
                    placeholder.append(reduce(lambda data, key: data.get(
                        key, {}), nested_key, dataset) or None)
                    placeholder.remove(None) if None in placeholder else ''
                    # placeholder.append(reduce(dict.get,nested_key,dataset))
                if isinstance(nested_key, str) and nested_key in dataset.keys():
                    placeholder.append(dataset.get(nested_key))
                for item in dataset.values():
                    get_nested_data(item, nested_key, placeholder)
        return placeholder
    return get_nested_data(dataset, nested_key, [])
tweets_text = find_nested_key(data, "full_text")
Edit: You don't have to define it manually anymore. It has been added to the master branch as well.
from tweeterpy import TweeterPy
from tweeterpy import util
from tweeterpy.util import find_nested_key
twitter = TweeterPy()
data = twitter.get_user_tweets("elonmusk", total=50)
usernames = util.find_nested_key(data, "screen_name")
tweets_text = find_nested_key(data, "full_text")
# find_nested_key now also accepts nested_key as a tuple.
tweets_creation = util.find_nested_key(data, ("tweet_results", "result", "legacy", "created_at")) |
Thank you for your hard work. May I ask how many tweets I can scrape per month? |
To check the rate limits, use the async-await branch: nearly every function there accepts a "return_rate_limit" argument. Just set it to True (take a look at #8). It will return the hourly rate limits. You can also search for the Twitter API rate limits to get an idea of how many requests you can make per day.
NOTE: A rate limit of, say, 2000 per day doesn't mean you can get only 2000 tweets a day. It means you can make 2000 requests a day. How much data you get depends on the type of data you request. For user data, each request returns one user, so 2000 requests yield 2000 users. For tweets, a request sometimes returns 30-50 tweets and at other times around 100. So it's better to keep an eye on the rate limits. The best way is to make a request and then query the rate limit stats to check how many requests it cost.
# Check the rate limits for user friends.
twitter.get_friends('', follower=True, return_rate_limit=True)
# Check the rate limits for user tweets.
twitter.get_user_tweets('', return_rate_limit=True)
# Each call returns the total number of API calls allowed and the number remaining, which gives you an idea of your budget.
Check this guide to switch to the async-await branch. Feel free to close the issue if you got what you were looking for. |
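To make the requests-versus-tweets arithmetic concrete, here is a rough sketch. It only multiplies the illustrative figures from the note above (2000 requests a day, roughly 30 to 100 tweets per request); those are not real limits, and `estimate_tweet_range` is a hypothetical helper, not part of tweeterpy.

```python
def estimate_tweet_range(requests_per_day, min_per_request, max_per_request):
    """Return (low, high) bounds on the tweets one day of requests could yield."""
    if min_per_request > max_per_request:
        raise ValueError("min_per_request must not exceed max_per_request")
    return requests_per_day * min_per_request, requests_per_day * max_per_request


# Illustrative numbers taken from the note above, not actual Twitter limits.
low, high = estimate_tweet_range(2000, 30, 100)
print(f"{low:,} to {high:,} tweets per day")  # 60,000 to 200,000 tweets per day
```

For a monthly figure, multiply by the number of days, but treat it as an upper bound at best: the real budget is whatever return_rate_limit reports for your session.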
using createdat=util.find_nested_key(user_tweets,"created_at") |
As I mentioned earlier, a single dataset may contain multiple keys with the same name. The "created_at" key is used for users as well as for tweets. You can just use a for loop. For tweets, "created_at" is nested at ['content']['itemContent']['tweet_results']['result']['legacy']. So a quick fix in your case is:
user_tweets = twitter.get_user_tweets("elonmusk", total=20)
[util.find_nested_key(tweet['content']['itemContent']['tweet_results']['result']['legacy'], "created_at") for tweet in user_tweets[0]['data']]
Edit: find_nested_key now also accepts nested_key as a tuple.
user_tweets = twitter.get_user_tweets("elonmusk", total=20)
tweets_creation = util.find_nested_key(user_tweets, ("tweet_results", "result", "legacy", "created_at")) |
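A small offline sketch of why the tuple form helps. The function is copied from earlier in this thread so the block runs without tweeterpy installed. The `sample` dict is made up for illustration: it only mimics the nesting mentioned above, the "core" wrapper around the user data is an assumption, and real responses contain many more fields.

```python
from functools import reduce


def find_nested_key(dataset=None, nested_key=None):
    # Copied from the thread above (indentation restored).
    def get_nested_data(dataset, nested_key, placeholder):
        if isinstance(dataset, list) or isinstance(dataset, dict) and dataset:
            if isinstance(dataset, list):
                for item in dataset:
                    get_nested_data(item, nested_key, placeholder)
            if isinstance(dataset, dict):
                if isinstance(nested_key, tuple) and nested_key[0] in dataset.keys():
                    placeholder.append(reduce(lambda data, key: data.get(
                        key, {}), nested_key, dataset) or None)
                    placeholder.remove(None) if None in placeholder else ''
                if isinstance(nested_key, str) and nested_key in dataset.keys():
                    placeholder.append(dataset.get(nested_key))
                for item in dataset.values():
                    get_nested_data(item, nested_key, placeholder)
        return placeholder
    return get_nested_data(dataset, nested_key, [])


# Made-up entry: "created_at" appears once for the user and once for the tweet.
sample = {
    "content": {"itemContent": {"tweet_results": {"result": {
        "core": {"user_results": {"result": {"legacy": {
            "created_at": "user-created-date"}}}},
        "legacy": {"created_at": "tweet-created-date"},
    }}}}
}

# A plain string key matches every "created_at", user and tweet alike.
print(find_nested_key(sample, "created_at"))
# ['user-created-date', 'tweet-created-date']

# A tuple key follows one specific path, so only the tweet's date comes back.
print(find_nested_key(sample, ("tweet_results", "result", "legacy", "created_at")))
# ['tweet-created-date']
```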
Is it possible to know where to find the list or dictionary that contains all the keys? For example, I would like to know which keys to use to get only the URL or the ID of a post. And thanks for keeping this up to date. |
Hey @ihabpalamino, you can check the official Twitter API website to see whether they have posted sample responses. Otherwise you will have to navigate through the response yourself to understand its key-value pairs. Just grab one of the results from the list; the other results are quite similar most of the time. I just updated the find_nested_key function. It now accepts nested_key as a tuple as well, which makes it easier to deal with multiple keys of the same name. Check the usage here |
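If navigating by hand gets tedious, one option is to print every key path in a single result and read them off. This is only a sketch: `iter_key_paths` is a hypothetical helper, not part of tweeterpy, and `sample_entry` is a made-up, heavily trimmed stand-in for a real result.

```python
def iter_key_paths(data, prefix=()):
    """Yield the path (a tuple of keys and list indexes) to every leaf value."""
    if isinstance(data, dict):
        for key, value in data.items():
            yield from iter_key_paths(value, prefix + (key,))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from iter_key_paths(item, prefix + (index,))
    else:
        # A leaf (string, number, None, ...): report how we got here.
        yield prefix


# Made-up entry for illustration only.
sample_entry = {
    "entryId": "tweet-1",
    "content": {"itemContent": {"tweet_results": {"result": {
        "rest_id": "1",
        "legacy": {"full_text": "hi", "created_at": "some-date"},
    }}}},
}

for path in iter_key_paths(sample_entry):
    print(" -> ".join(str(part) for part in path))
```

The tail of any printed path (for example tweet_results, result, legacy, created_at) can then be passed as a tuple to find_nested_key.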
Thanks, but I still can't get the URL of each post that leads directly to the tweet. |
Twitter doesn't send the direct URL to tweets in this dataset. You will have to build it yourself. The tweet URL structure is https://www.twitter.com/{username}/status/{tweet_id}:
user_tweets = twitter.get_user_tweets("elonmusk", total=10)
for user in user_tweets:
    for tweet in user["data"]:
        # Skip promoted tweets.
        if tweet.get("entryId", "").startswith("promote"):
            continue
        tweet_id = util.find_nested_key(tweet, ("tweet_results", "result", "rest_id"))
        username = util.find_nested_key(tweet, ("user_results", "result", "legacy", "screen_name"))
        if tweet_id and username:
            print(f"https://www.twitter.com/{username[0]}/status/{tweet_id[0]}") |
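Putting the pieces together for the original question (ID, content, date, likes, retweets, URL), a minimal sketch could look like the one below. Several things here are assumptions: `first_match` and `summarize_tweet` are hypothetical helpers, not tweeterpy functions; the "favorite_count" and "retweet_count" key names and the "core" wrapper do not appear in this thread, so check them against a real response; and `sample_entry` is made up.

```python
def first_match(data, path):
    """Depth-first search for the first place `path` can be followed; None if absent."""
    if isinstance(data, dict):
        if path[0] in data:
            value = data
            for key in path:
                value = value.get(key) if isinstance(value, dict) else None
                if value is None:
                    break
            if value is not None:
                return value
        children = data.values()
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = first_match(child, path)
        if found is not None:
            return found
    return None


def summarize_tweet(entry):
    """Reduce one timeline entry to a flat dict with only the wanted fields."""
    legacy = ("tweet_results", "result", "legacy")
    tweet_id = first_match(entry, ("tweet_results", "result", "rest_id"))
    username = first_match(entry, ("user_results", "result", "legacy", "screen_name"))
    has_url_parts = tweet_id is not None and username is not None
    return {
        "id": tweet_id,
        "text": first_match(entry, legacy + ("full_text",)),
        "created_at": first_match(entry, legacy + ("created_at",)),
        "likes": first_match(entry, legacy + ("favorite_count",)),    # assumed key name
        "retweets": first_match(entry, legacy + ("retweet_count",)),  # assumed key name
        "url": f"https://www.twitter.com/{username}/status/{tweet_id}" if has_url_parts else None,
    }


# Made-up entry; real entries carry many more fields.
sample_entry = {
    "entryId": "tweet-123",
    "content": {"itemContent": {"tweet_results": {"result": {
        "rest_id": "123",
        "core": {"user_results": {"result": {"legacy": {"screen_name": "example_user"}}}},
        "legacy": {"full_text": "hello world", "created_at": "some-date",
                   "favorite_count": 5, "retweet_count": 2},
    }}}},
}

print(summarize_tweet(sample_entry))
```

With real data you would call summarize_tweet on each tweet inside the loop above, after skipping the promoted entries.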
Assuming this is what you were looking for, I am closing the issue. |
Hello @iSarabjitDhiman, I would like to ask how I can customize the JSON result to get only the ID, content, creation date, number of likes and retweets, and URL of each post instead of the whole thing. Thanks!