# C-More

### MongoDB

In [1]:
import pymongo

In [2]:
client = pymongo.MongoClient('mongodb://localhost:27017/')

#### 1. Create database

In [4]:
# create database rep_analysis_test

db = client['rep_analysis_test']

In MongoDB, a **database** is not created until it gets content.

#### 2. Create collection

In [5]:
# create search_words collection (similar to an RDBMS table)

search_words = db['search_words']

In [6]:
db.list_collection_names()

[]

In MongoDB, a **collection** is not created until it gets content.

In [7]:
# insert search words defined by each client into collection search_words

new_search_words = [{"_id": 1, "company": "Vodafone", "words": ["vodafone", "5G"]}, 
                    {"_id": 2, "company": "Santander", "words": ["santander", "card", "account", "loan", "banking"]}, 
                    {"_id": 3, "company": "BP", "words": ["bp", "shell", "repsol", "galp", "prio"]}]

result = search_words.insert_many(new_search_words)

In [8]:
result.inserted_ids

[1, 2, 3]

In [9]:
db.list_collection_names()

['search_words']
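
The lazy creation described above can be checked with two small helpers. This is a minimal sketch: the helper names `database_exists` and `collection_exists` are hypothetical, while `list_database_names()` and `list_collection_names()` are the regular PyMongo methods.

```python
def database_exists(client, name):
    """Return True if the server currently lists a database called `name`."""
    return name in client.list_database_names()


def collection_exists(db, name):
    """Return True if the database currently lists a collection called `name`."""
    return name in db.list_collection_names()
```

With the `client` and `db` objects from this notebook, `database_exists(client, 'rep_analysis_test')` and `collection_exists(db, 'search_words')` should both be `False` before the `insert_many` call and `True` after it.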

#### 3. Select data from a collection

In [10]:
for doc in search_words.find():
    print(doc)

{'_id': 1, 'company': 'Vodafone', 'words': ['vodafone', '5G']}
{'_id': 2, 'company': 'Santander', 'words': ['santander', 'card', 'account', 'loan', 'banking']}
{'_id': 3, 'company': 'BP', 'words': ['bp', 'shell', 'repsol', 'galp', 'prio']}


In [11]:
# find words for company vodafone

my_query = {"company": "Vodafone"}

for words in search_words.find(my_query, {"_id": 0, "company": 0}):
    print(words['words'])

['vodafone', '5G']
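
The query above excludes `_id` and `company` from the result. As a sketch of an alternative, the projection can instead name the one field to keep, and `find_one` can replace the loop when a single document is expected. The helper name `get_search_words` is hypothetical; `find_one` is the regular PyMongo collection method.

```python
def get_search_words(collection, company):
    """Return the list of search words stored for `company`, or [] if none."""
    # Inclusion projection: keep only "words" and drop the default "_id".
    doc = collection.find_one({"company": company}, {"_id": 0, "words": 1})
    return doc["words"] if doc else []
```

In this notebook it would be called as `get_search_words(search_words, "Vodafone")`.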


#### 4. Check collections after running the .py script to get Twitter data

In [12]:
db.list_collection_names()

['keywords', 'search_words']

#### 5. Select data from the keywords collection

In [14]:
keywords = db['keywords']

In [15]:
for doc in keywords.find():
    print(doc)

{'_id': ObjectId('62f28544fe960096a8b1e236'), 'created_at': '2022-08-08T23:26:46.000Z', 'text': '@VodafoneOMN Please let me know if any vacancy available in Vodafone oman family', 'lang': 'en', 'id': '1556784011619565573', 'public_metrics': {'retweet_count': 0, 'reply_count': 1, 'like_count': 0, 'quote_count': 0}}
{'_id': ObjectId('62f28544fe960096a8b1e237'), 'created_at': '2022-08-08T23:08:57.000Z', 'text': 'The TL top make ass ass Ei but Vodafone no dey make adey watch am well oh💔🤦🏽\u200d♂️.', 'lang': 'en', 'id': '1556779526935511040', 'public_metrics': {'retweet_count': 1, 'reply_count': 1, 'like_count': 10, 'quote_count': 0}}
{'_id': ObjectId('62f28544fe960096a8b1e238'), 'created_at': '2022-08-08T23:07:31.000Z', 'text': "@t0mm13b It's the Vodafone Gigabox https://t.co/8M8a6c52PQ and this is the fritz box model https://t.co/YNPkW0FR7V\n\nIt's FTTC for the minute, the main device hands out the DHCP which is the VF device", 'lang': 'en', 'id': '1556779164954394624', 'public_metrics': 

In [16]:
# total number of documents

keywords.count_documents({})

69

In [22]:
# create text index to perform $text queries
# https://stackoverflow.com/questions/33541290/how-can-i-create-an-index-with-pymongo
# https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html#pymongo.collection.Collection.create_index

keywords.create_index([('text', pymongo.TEXT)])

'text_text'

In [23]:
cursor = keywords.find({"$text": {"$search": "vodafone"}}, {"text": 1, '_id': 0})

i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

@VodafoneOMN Please let me know if any vacancy available in Vodafone oman family
The TL top make ass ass Ei but Vodafone no dey make adey watch am well oh💔🤦🏽‍♂️.
@Simondoestweets Hmm, am not familiar with Fritzbox 😕what is the model of fritzbox and Vodafone modem? Am assuming it's FTTH? Which device dishes out DHCP addresses?
#Konekt on all networks today! 🤩 #MakeTheSwitch to Vodafone and take advantage of our new #WantokKombo plans!! 

#AllNetworkPlans #MakeTheSwitch #TogetherWeCan #VodafonePNG https://t.co/K99JXI01hY
@t0mm13b It's the Vodafone Gigabox https://t.co/8M8a6c52PQ and this is the fritz box model https://t.co/YNPkW0FR7V

It's FTTC for the minute, the main device hands out the DHCP which is the VF device

 --> 5 documents in total


In [24]:
cursor = keywords.find({"$text": {"$search": "5G"}}, {"text": 1, '_id': 0})
    
i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

#5G News Via @7GTech 5g networks: The rationale for Indian companies needing their own 5G networks, Telecom News, ET Telecom https://t.co/5bFmsjVio0, see more https://t.co/jo3q1pfSzq
#5G @rvp @7GTech @5GAIoT @C4ISRT @EMF5G  5g networks: The rationale for Indian companies needing their own 5G networks, Telecom News, ET Telecom https://t.co/gQHiQbiORe, see more https://t.co/hpN2zqa3iX
Tested @enjoyGLOBE 5G here in BGC and @DITOphofficial 5G. 

What a surprise. https://t.co/Npw8mgiaYE
5G Explained 
Digital Insights Podcast🎙️10

The introduction of #5G brings faster speeds, significantly more capacity, mobile edge computing, and lower latency to the world of smartphones and cellular IoT devices. The use of different wireless https://t.co/ATONhzRzmN
5G is a lie
@FotoNugget Na 5G network 😂😂
@SimplyExtraoz Snapdragon 680 is not a 5g chipset
fucking wifi!!!!!!! 5G is a lie in latam!!!!
@BlackberryXRP NYLink lol how they track your moments via 5G
Samsung M32 5G seems like a fairly good phone.
R

In [25]:
cursor = keywords.aggregate([{'$match': {"$text": {"$search": "5G"}}}, {"$count": "Number of documents"}])

for result in cursor:
    print(result)

{'Number of documents': 60}


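Adding the `$text` results for 'vodafone' and '5G' gives 65 matches (5 + 60), yet the collection holds 69 documents. (The sum also counts any tweet that contains both words twice.)

The gap is not a matter of case: `$text` search is case insensitive by default, and the '5G' search above also returned a tweet containing '5g'. It matches whole words, however, so a tweet that mentions only `@VodafoneUK` is not found by a search for 'vodafone'. Regular expressions match substrings, so we can use them to retrieve all the documents we are interested in.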
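
One way to see why a whole-word search and a substring search can return different counts is to compare them on a few tweets in plain Python. This is only a sketch: the `tokens` function is a simplified stand-in, not MongoDB's actual tokenizer, and the sample tweets are shortened from the output in this notebook.

```python
import re

# Sample tweets, shortened from the notebook output.
tweets = [
    "@VodafoneOMN Please let me know if any vacancy available in Vodafone oman family",
    "your 5G is so poor, when all my friends with @EE and @VodafoneUK have excellent connections",
    "@SimplyExtraoz Snapdragon 680 is not a 5g chipset",
]


def tokens(text):
    """Lowercased alphanumeric tokens (a simplification, not MongoDB's tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def whole_word_match(text, term):
    """True if `term` equals one of the tokens in `text`, ignoring case."""
    return term.lower() in tokens(text)


def substring_match(text, term):
    """True if `term` occurs anywhere in `text`, ignoring case."""
    return re.search(re.escape(term), text, re.IGNORECASE) is not None


for term in ("vodafone", "5g"):
    word_hits = [i for i, t in enumerate(tweets) if whole_word_match(t, term)]
    sub_hits = [i for i, t in enumerate(tweets) if substring_match(t, term)]
    print(term, "whole-word:", word_hits, "substring:", sub_hits)
```

The second tweet contains `@VodafoneUK` but not the standalone word, so only the substring match finds it for 'vodafone'. Both approaches find '5G' and '5g' alike.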

In [26]:
my_query = {"text": {"$regex": "(V|v)odafone"}}

cursor = keywords.find(my_query)

i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

@VodafoneOMN Please let me know if any vacancy available in Vodafone oman family
The TL top make ass ass Ei but Vodafone no dey make adey watch am well oh💔🤦🏽‍♂️.
@t0mm13b It's the Vodafone Gigabox https://t.co/8M8a6c52PQ and this is the fritz box model https://t.co/YNPkW0FR7V

It's FTTC for the minute, the main device hands out the DHCP which is the VF device
#Konekt on all networks today! 🤩 #MakeTheSwitch to Vodafone and take advantage of our new #WantokKombo plans!! 

#AllNetworkPlans #MakeTheSwitch #TogetherWeCan #VodafonePNG https://t.co/K99JXI01hY
@Simondoestweets Hmm, am not familiar with Fritzbox 😕what is the model of fritzbox and Vodafone modem? Am assuming it's FTTH? Which device dishes out DHCP addresses?
Hello @ThreeUK can’t wait until December to cancel my contract with yous, your 5G is so poor, your signal is so poor, when all my friends with @EE and @VodafoneUK have excellent connections, i’m the only one struggling because i’m the only one with yous, can’t wait to cancel

In [27]:
my_query = {"text": {"$regex": "5(G|g)"}}

cursor = keywords.find(my_query)

i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

5G Auction: Reliance Jio Submits Earnest Money Deposit Of Rs 14,000 Crore https://t.co/92GL9F3ntl
Hello @ThreeUK can’t wait until December to cancel my contract with yous, your 5G is so poor, your signal is so poor, when all my friends with @EE and @VodafoneUK have excellent connections, i’m the only one struggling because i’m the only one with yous, can’t wait to cancel
@MentalHealerid Soft Case Glitter Transparan + Stand Holder + Lanyard Untuk Samsung Galaxy A72 A52 A32 5g 4g A12 A02 A02S A71 A51 A21S M02 M12 2021 Rp63,293 https://t.co/siNisklnwV https://t.co/VjkLtkkV20
@BlackberryXRP 5G small cells keeping track of your every move
Tested @enjoyGLOBE 5G here in BGC and @DITOphofficial 5G. 

What a surprise. https://t.co/Npw8mgiaYE
Amazon Sale On Oneplus Best OnePlus 10T 5G Phone Camera Best Oneplus Phone Under 30000 Amazon Great Freedom Festival https://t.co/BmAx473iuS
@Xfinity  
@xfinitysupport
 it took ~40 minute for chat agent to tell me my newly bought unlocked Galaxy A13 5G is n

Alternatively, we can use Python's `re` module.

In [28]:
import re

In [29]:
my_query = {"text": {"$regex": re.compile('vodafone', re.IGNORECASE)}}

cursor = keywords.find(my_query)

i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

@VodafoneOMN Please let me know if any vacancy available in Vodafone oman family
The TL top make ass ass Ei but Vodafone no dey make adey watch am well oh💔🤦🏽‍♂️.
@t0mm13b It's the Vodafone Gigabox https://t.co/8M8a6c52PQ and this is the fritz box model https://t.co/YNPkW0FR7V

It's FTTC for the minute, the main device hands out the DHCP which is the VF device
#Konekt on all networks today! 🤩 #MakeTheSwitch to Vodafone and take advantage of our new #WantokKombo plans!! 

#AllNetworkPlans #MakeTheSwitch #TogetherWeCan #VodafonePNG https://t.co/K99JXI01hY
@Simondoestweets Hmm, am not familiar with Fritzbox 😕what is the model of fritzbox and Vodafone modem? Am assuming it's FTTH? Which device dishes out DHCP addresses?
Hello @ThreeUK can’t wait until December to cancel my contract with yous, your 5G is so poor, your signal is so poor, when all my friends with @EE and @VodafoneUK have excellent connections, i’m the only one struggling because i’m the only one with yous, can’t wait to cancel

In [30]:
my_query = {"text": {"$regex": re.compile('5g', re.IGNORECASE)}}

cursor = keywords.find(my_query)

i=0

for x in cursor:
    i+=1
    print(x['text'])
    
print('\n --> ' + str(i) + ' documents in total')

5G Auction: Reliance Jio Submits Earnest Money Deposit Of Rs 14,000 Crore https://t.co/92GL9F3ntl
Hello @ThreeUK can’t wait until December to cancel my contract with yous, your 5G is so poor, your signal is so poor, when all my friends with @EE and @VodafoneUK have excellent connections, i’m the only one struggling because i’m the only one with yous, can’t wait to cancel
@MentalHealerid Soft Case Glitter Transparan + Stand Holder + Lanyard Untuk Samsung Galaxy A72 A52 A32 5g 4g A12 A02 A02S A71 A51 A21S M02 M12 2021 Rp63,293 https://t.co/siNisklnwV https://t.co/VjkLtkkV20
@BlackberryXRP 5G small cells keeping track of your every move
Tested @enjoyGLOBE 5G here in BGC and @DITOphofficial 5G. 

What a surprise. https://t.co/Npw8mgiaYE
Amazon Sale On Oneplus Best OnePlus 10T 5G Phone Camera Best Oneplus Phone Under 30000 Amazon Great Freedom Festival https://t.co/BmAx473iuS
@Xfinity  
@xfinitysupport
 it took ~40 minute for chat agent to tell me my newly bought unlocked Galaxy A13 5G is n
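
A third option is to keep the pattern as a plain string and pass the case-insensitive flag through MongoDB's `$options` operator. The sketch below only builds the filter document, so it runs without a server. The helper name `case_insensitive_filter` is hypothetical, and using `re.escape` assumes that Python's escaping is acceptable to the server's regex engine for the search terms involved.

```python
import re


def case_insensitive_filter(field, term):
    """Build a filter matching `term` anywhere in `field`, ignoring case."""
    # re.escape keeps characters such as '.' or '+' in the term literal.
    return {field: {"$regex": re.escape(term), "$options": "i"}}


vodafone_filter = case_insensitive_filter("text", "vodafone")
print(vodafone_filter)
```

Passing such a filter to `keywords.count_documents(...)`, as done earlier with an empty filter, would give the number of matches directly instead of counting inside a loop.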