# Using the **pyTwitterDB** library

#### *This is a sample showing how to use the pyTwitterDB library*

***

#### **Requirements:**
1. Python (32-bit)

2. **Database**: MongoDB

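3. **Libraries:**
 + pymongo
 + NLTK
 + numpy
 + gensim
 + sklearn (installed as scikit-learn)
 + csv (Python standard library)
 + string (Python standard library)
 + json (Python standard library)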
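
Before running the rest of this sample, it can help to confirm that the third-party packages are installed. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names (for example `nltk` for NLTK) are the usual ones for these packages.

```python
import importlib.util

# Import names of the third-party packages listed above.
REQUIRED = ["pymongo", "nltk", "numpy", "gensim", "sklearn"]

def missing_modules(names):
    """Return the subset of `names` that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any name reported as missing would need to be installed before the corresponding step is run.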


##### Importing the library

In [None]:
import pyTwitterDB
from pymongo import MongoClient

##### Setting up the DB connection

In [None]:
#db connection
mongoDBConnectionSTR = "mongodb://localhost:27017"
client = MongoClient(mongoDBConnectionSTR)
db = client.twitter_DB_H  # choose your DB name here

##### Defining an object of type pyTwitterDB and preparing the settings for Twitter analysis

In [None]:
#define settings
fieldsConfig = "id_str;created_at;lang;reply_count;retweet_count;in_reply_to_status_id_str;in_reply_to_screen_name"
fieldsUsrConfig =  "name;screen_name;description;location;followers_count;friends_count;statuses_count;lang;verified"

# define an object of type pyTwitterDB
x = pyTwitterDB.pyTwitterDB_class(db, fieldsConfig, fieldsUsrConfig)
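
The two settings strings are semicolon-separated lists of field names. As a small sketch, they could be built from Python lists instead of being typed by hand, which makes it easier to add or remove fields. The helper `to_field_config` and the variable names below are hypothetical, and the sketch assumes the library simply splits these strings on `;`.

```python
# Tweet-level and user-level fields to keep, written as Python lists.
tweet_fields = ["id_str", "created_at", "lang", "retweet_count"]
user_fields = ["screen_name", "followers_count", "verified"]

def to_field_config(fields):
    """Join field names with ';', dropping blanks and duplicates (order kept)."""
    cleaned = [f.strip() for f in fields if f.strip()]
    return ";".join(dict.fromkeys(cleaned))

tweet_config = to_field_config(tweet_fields)
user_config = to_field_config(user_fields)
print(tweet_config)
print(user_config)
```

The resulting strings have the same shape as `fieldsConfig` and `fieldsUsrConfig` and could be passed to the constructor in their place.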


##### Loading tweets from files into MongoDB

In [None]:
x.loadDocFromFile("C:\\Data\\tweetAnalysis-Summer19\\tst2")

#### **Suggestion**: *Before running the next step, add an index on the "seq_no" field of the "tweet" collection.*
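
One way to add such an index from Python is pymongo's `create_index` method on a collection. The sketch below wraps it in a hypothetical helper, `ensure_seq_no_index`, so it can be reused; it assumes the collection is reachable as `db.tweet` through the connection opened earlier.

```python
def ensure_seq_no_index(collection):
    """Create an ascending index on "seq_no" and return the index name.

    `collection` is expected to be a pymongo Collection, e.g. db.tweet.
    """
    # 1 means ascending order (the value of pymongo.ASCENDING).
    return collection.create_index([("seq_no", 1)])

# Usage against the database opened earlier (requires a running MongoDB):
# ensure_seq_no_index(db.tweet)
```

The same helper would apply to any other collection that needs an index on this field.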


##### Loading Focused Data into MongoDB
*This function uses the settings defined in the fieldsConfig and fieldsUsrConfig variables*

In [None]:
x.loadFocusedData(100000)

#### **Suggestion**: *Before running the next step, add an index on the "seq_no" field of the "focusedTweet" collection.*

##### Breaking tweets into words
*The parameter is the number of tweets processed at a time. If you see an error, lower this number.*

In [None]:
x.loadWordsData(30000)
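
The advice about lowering the batch size can be automated. The following is only a sketch: `run_with_shrinking_batch` is a hypothetical helper, and it assumes that calling the step again after a failure is safe, which should be confirmed for `loadWordsData` before relying on it.

```python
def run_with_shrinking_batch(step, batch_size, min_batch_size=1000):
    """Call step(batch_size); on failure, halve the batch size and retry.

    Returns the batch size that succeeded. Re-raises the last error once
    the next size would drop below `min_batch_size`.
    """
    while True:
        try:
            step(batch_size)
            return batch_size
        except Exception as err:
            smaller = batch_size // 2
            if smaller < min_batch_size:
                raise
            print(f"Batch of {batch_size} failed ({err}); retrying with {smaller}")
            batch_size = smaller

# Usage with the object created earlier:
# run_with_shrinking_batch(x.loadWordsData, 30000)
```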

##### Loading Aggregations into MongoDB
*To add more types of aggregation, create a new function and call it from "loadAggregations". You can follow the logic of the existing ones.*

In [None]:
x.loadAggregations('tweetCountByFile')
x.loadAggregations('hashtagCount')
x.loadAggregations('tweetCountByLanguageAgg')
x.loadAggregations('tweetCountByPeriodAgg')
x.loadAggregations('tweetCountByUser')
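
As an illustration of what a new aggregation might contain, the sketch below builds a MongoDB aggregation pipeline that counts tweets per value of the `verified` field. The function name, the output field `tweetCount`, and the assumption that `verified` is stored at the top level of the `focusedTweet` documents are all hypothetical; how the function would be wired into `loadAggregations` depends on the library's internals, which are not shown here.

```python
def tweet_count_by_verified_pipeline():
    """Build a MongoDB aggregation pipeline counting tweets per `verified` flag."""
    return [
        # Group documents by the value of "verified" and count each group.
        {"$group": {"_id": "$verified", "tweetCount": {"$sum": 1}}},
        # Largest groups first.
        {"$sort": {"tweetCount": -1}},
    ]

# Usage (requires a running MongoDB; collection and field names are assumptions):
# for row in db.focusedTweet.aggregate(tweet_count_by_verified_pipeline()):
#     print(row["_id"], row["tweetCount"])
```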

### Exporting data into files

##### Exporting aggregations into pipe-delimited (|) files. (These files can be opened as CSV.)

In [None]:
exportPath = 'C:\\Data\\tweetAnalysis-Summer19\\tst2exports'

x.exportData('tweetCountByFile', exportPath, 0)
x.exportData('hashtagCount', exportPath, 0)
x.exportData('tweetCountByMonth', exportPath, 0)
x.exportData('tweetCountByLanguage', exportPath, 0)
x.exportData('tweetCountByUser', exportPath, 0)
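
Because the exports are pipe-delimited, they can be read back with Python's `csv` module by setting `delimiter="|"`. The sketch below uses made-up in-memory data as a stand-in for an exported file; the real column names, header layout, and file names produced by `exportData` may differ.

```python
import csv
import io

def read_pipe_delimited(handle):
    """Parse a |-delimited text stream into a list of rows (lists of strings)."""
    return list(csv.reader(handle, delimiter="|"))

# Stand-in for an exported file; real column names and layout may differ.
sample = io.StringIO("hashtag|count\npython|42\nmongodb|17\n")
rows = read_pipe_delimited(sample)
print(rows[0])   # first row
print(rows[1:])  # remaining rows

# With a real export (file name is hypothetical):
# with open("hashtagCount.csv", newline="", encoding="utf-8") as f:
#     rows = read_pipe_delimited(f)
```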

##### Exporting every tweet text with its period into pipe-delimited (|) files. (These files can be opened as CSV.)
*Since there could be too many tweets for one file, you can set the parameter "inc" to the number of lines each file should have.*

In [None]:
x.exportData('tweetTextAndPeriod', exportPath, 150000)
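
To make the effect of "inc" concrete, the sketch below splits a sequence of rows into groups of at most `inc` lines, one group per output file. It is not the library's implementation; `split_rows` is a hypothetical helper, and treating `inc` of 0 as "everything in one file" is an assumption inferred from the aggregation exports above, which pass 0.

```python
from itertools import islice

def split_rows(rows, inc):
    """Yield lists of at most `inc` rows; inc <= 0 yields everything at once."""
    iterator = iter(rows)
    if inc <= 0:
        yield list(iterator)
        return
    while True:
        chunk = list(islice(iterator, inc))
        if not chunk:
            return
        yield chunk

# Seven rows with inc=3 would be spread over three files.
for file_no, chunk in enumerate(split_rows(range(7), 3), start=1):
    print(f"file {file_no}: {len(chunk)} lines")
```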

##### Exporting every word of each tweet into pipe-delimited (|) files. (These files can be opened as CSV.)
*Since there could be too many words for one file, you can set the parameter "inc" to the number of lines each file should have.*

In [None]:
x.exportData('wordsOnEachTweet', exportPath, 1000000)

##### Exporting every tweet text with details of the user into pipe-delimited (|) files. (These files can be opened as CSV.)
*Since there could be too many tweets for one file, you can set the parameter "inc" to the number of lines each file should have.*

In [None]:
x.exportData('userDetailsOnEachTweet', exportPath, 100000)

### Topic Analysis

In [None]:
num_topics_lda=10
num_topics_lsi=10
num_topics_nmf=10
max_no_tweets_perHT=70000

##### Running topic analysis with the gensim model

In [None]:
x.findTopics(num_topics_lda, num_topics_lsi, 0, max_no_tweets_perHT, "gensim")

##### Running topic analysis with the sklearn model

In [None]:
x.findTopics(num_topics_lda, num_topics_lsi, num_topics_nmf, max_no_tweets_perHT, "sklearn")

##### Exporting topics by hashtag into a file

In [None]:
x.exportData('topicByHashtag', exportPath)