Last time, we learned that the attributes associated with a single tweet can be listed using the dir() function.

![](https://i.imgur.com/nzL4SSS.jpg)

For example, if we run something like:

In [None]:
print(tweets[0].retweet_count)

it should show us how many times the first tweet (tweets[0]) was retweeted.
We can access the other attributes the same way.
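
As a minimal sketch of reading several attributes at once, the snippet below uses a stand-in object rather than a real tweet. The name `fake_tweet` and all of its values are made up for illustration; only the attribute names come from this tutorial.

```python
from types import SimpleNamespace

# Hypothetical stand-in for one tweet object; the values are invented.
fake_tweet = SimpleNamespace(
    id=1,
    text="hello world",
    retweet_count=3,
    favorite_count=7,
)

# Read several attributes by name, equivalent to fake_tweet.retweet_count etc.
wanted = ["id", "text", "retweet_count", "favorite_count"]
summary = {name: getattr(fake_tweet, name) for name in wanted}
print(summary)
```

With a real tweet, the same `getattr` loop should work for any attribute name that dir() lists.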

If we want, for example, to store the id of each tweet along with its text, we can make a small change to the function we defined earlier, which takes the text of each tweet in the JSON stream and stores it in a data-frame.
The changes in the new version of this function are:
- we take the tweets passed to the function and loop through them.
- while looping, we read the id of each tweet.
- we collect those ids in a list.
- we convert the list to a NumPy array.
- we create a column in our data-frame and call it 'id'.


In [None]:
class TweetAnalyzer():
    def tweets_to_dataframe(self, tweets):
        df = pd.DataFrame(data=[t.text for t in tweets], columns=['Tweets'])
        df['id'] = np.array([t.id for t in tweets])  # new: store the id of each tweet

        return df

Then, to check that everything works, print the head of the data-frame again in the main part of the code.

In [None]:
print(df.head(10))

![](https://i.imgur.com/nj9hFfe.jpg)

Now we have the corresponding id for each tweet. 

In the same way, we can keep extracting whatever information we think is useful.

In [None]:
class TweetAnalyzer():
    def tweets_to_dataframe(self, tweets):
        df = pd.DataFrame(data=[t.text for t in tweets], columns=['Tweets'])

        df['id'] = np.array([t.id for t in tweets])
        df['len'] = np.array([len(t.text) for t in tweets])
        df['date'] = np.array([t.created_at for t in tweets])
        df['source'] = np.array([t.source for t in tweets])
        df['likes'] = np.array([t.favorite_count for t in tweets])
        df['retweets'] = np.array([t.retweet_count for t in tweets])

        return df
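
To try this pattern without calling the Twitter API, here is a rough sketch that builds the same columns from stand-in objects. The `fake_tweets` list, its values, and the `columns` mapping are assumptions made for illustration; they are not part of the original function.

```python
from datetime import datetime
from types import SimpleNamespace

import pandas as pd

# Hypothetical stand-ins for tweet objects; all values are invented.
fake_tweets = [
    SimpleNamespace(id=101, text="first tweet",
                    created_at=datetime(2019, 1, 1, 9, 0),
                    source="Twitter Web Client",
                    favorite_count=5, retweet_count=1),
    SimpleNamespace(id=102, text="a second, longer tweet",
                    created_at=datetime(2019, 1, 2, 18, 30),
                    source="Twitter for Android",
                    favorite_count=42, retweet_count=9),
]

# Map each data-frame column to the tweet attribute it is read from.
columns = {
    "id": "id",
    "date": "created_at",
    "source": "source",
    "likes": "favorite_count",
    "retweets": "retweet_count",
}

df = pd.DataFrame({"Tweets": [t.text for t in fake_tweets]})
for column, attribute in columns.items():
    df[column] = [getattr(t, attribute) for t in fake_tweets]

# The length is derived from the text rather than read from an attribute.
df["len"] = df["Tweets"].str.len()
print(df)
```

Keeping the column-to-attribute mapping in one dictionary means a new column only needs one extra entry.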

Other kinds of information we can get include:
- the average length of all extracted tweets.

This is very simple:

In [None]:
print(np.mean(df['len']))

Or the number of likes received by the most-liked tweet:

In [None]:
print(np.max(df['likes']))
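
The maximum alone does not tell us which tweet it belongs to. As a small sketch, assuming a data-frame with the same 'Tweets', 'likes' and 'retweets' columns as above (the sample rows here are invented), pandas' `idxmax` can locate the row:

```python
import pandas as pd

# Invented sample data standing in for the tutorial's data-frame.
df = pd.DataFrame({
    "Tweets": ["first", "second", "third"],
    "likes": [5, 42, 17],
    "retweets": [1, 9, 30],
})

# idxmax returns the index label of the row holding the largest value.
most_liked = df.loc[df["likes"].idxmax()]
most_retweeted = df.loc[df["retweets"].idxmax()]

print(f"Most liked: {most_liked['Tweets']!r} with {most_liked['likes']} likes")
print(f"Most retweeted: {most_retweeted['Tweets']!r} "
      f"with {most_retweeted['retweets']} retweets")
```

If several tweets tie for the maximum, `idxmax` returns the first one.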

We may also want to try data visualization, or simply plot a time series of any attribute we want.

[Matplotlib](https://matplotlib.org/) is the library to import in this case. 

In [None]:
import matplotlib.pyplot as plt

time_favs = pd.Series(data=df['likes'].values, index=df['date'])
time_favs.plot(figsize=(16, 4), color='r')
plt.show()

![](https://i.imgur.com/b6OjqEY.jpg)

We can plot time series for other attributes the same way, such as tweet length or number of retweets.

One for retweets will look something like this:

In [None]:
time_retweet = pd.Series(data=df['retweets'].values, index=df['date'])
time_retweet.plot(figsize=(16, 4), color='r')
plt.show()

![](https://i.imgur.com/d4h1eTR.jpg)

Pay attention to the peaks, and consider what might have caused any outliers that show up.
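
One possible way to pick out the peaks numerically, rather than by eye, is to sort the series and keep the largest values. The sketch below uses invented dates and counts in place of the real `time_retweet` series:

```python
import pandas as pd

# Invented retweet counts indexed by date, standing in for time_retweet.
dates = pd.to_datetime([
    "2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-05",
])
time_retweet = pd.Series(data=[3, 4, 250, 5, 2], index=dates)

# The three largest values with their dates, from highest to lowest.
peaks = time_retweet.nlargest(3)
print(peaks)
```

The dates in the result tell us which days to look at when searching for the cause of a spike.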

We can also combine both plots: 

In [None]:
time_retweet = pd.Series(data=df['retweets'].values, index=df['date'])
time_retweet.plot(figsize=(16, 4), label='retweet', legend=True)

time_likes = pd.Series(data=df['likes'].values, index=df['date'])
time_likes.plot(figsize=(16, 4), label='likes', legend=True)
    
plt.show()

![](https://i.imgur.com/C8FjgT7.jpg)

Plotting more than one variable together lets us spot possible correlations between them.
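
A plot only suggests a relationship; a correlation coefficient puts a number on it. As a rough sketch, assuming 'likes' and 'retweets' columns like the ones built earlier (the values below are invented), pandas can compute the Pearson correlation directly:

```python
import pandas as pd

# Invented sample data standing in for the tutorial's data-frame.
df = pd.DataFrame({
    "likes": [10, 25, 31, 48, 60],
    "retweets": [2, 5, 4, 11, 12],
})

# Series.corr uses the Pearson method by default; the result lies in [-1, 1].
corr = df["likes"].corr(df["retweets"])
print(f"Correlation between likes and retweets: {corr:.3f}")
```

A value close to 1 means the two counts tend to rise together, while a value near 0 means there is little linear relationship. Neither says anything about cause.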