# SocialVec Examples

## Import and initializations

The cell below is only needed to import a local copy of the package that was not installed with pip.

In [1]:
import os
import sys
from pathlib import Path
package_dir = os.path.join(Path(os.getcwd()).parent.absolute(),'socialvec')
sys.path.append(os.path.dirname(package_dir))

If you installed the package with pip, you can simply import it as shown below.

In [2]:
from socialvec.socialvec import SocialVec
#from socialvec.socialvec import SocialVecClassifier

2022-10-20 13:35:16.561164: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [3]:
sv = SocialVec(model_name="2020_2022")

✅  Initialize Model
✅  Load Metadata


## Basic Usage Examples

### Get a user's vector by Twitter ID (string or integer) or by username

In [None]:
sv[12]

In [None]:
sv["12"]

In [None]:
sv["jack"]

### Get similar users

In [None]:
sv.get_similar('jack')

### Get the average embeddings of multiple users
To get the embedding of a user who is not a popular entity, collect the list of accounts that the user follows and pass it to the get_average_embeddings function, which returns an embedding vector for that user.

**This function currently accepts only a list of user IDs.**

In [35]:
sv.get_userid('madonna')

'512700138'

In [39]:
v = sv.get_average_embeddings([sv.get_userid('rihanna'),
                               sv.get_userid('arianagrande'),
                               sv.get_userid('madonna')])


sv.get_similar(v[0])

Unnamed: 0,twitter_id,similarity,screen_name,name,description
0,79293791,0.903579,rihanna,Rihanna,
1,153694176,0.879907,IGGYAZALEA,IGGY AZALEA,None of your business
2,34507480,0.877103,ArianaGrande,Ariana Grande,
3,184910040,0.871745,Adele,Adele,https://t.co/31T2EYGNLy
4,512700138,0.863626,Madonna,Madonna,https://t.co/WwM6FVkZFU
5,268414482,0.858954,MileyCyrus,Miley Cyrus,M-CEO of Twitter 💙🕊
6,100220864,0.858912,BrunoMars,Bruno Mars,Silk Sonic album is available everywhere!
7,274119641,0.857929,VanessaHudgens,Vanessa Hudgens,🔮
8,268439864,0.85452,xtina,Christina Aguilera,🖤🤍 #Suéltame #LaTormenta
9,157140968,0.849562,KendallJenner,Kendall,@drink818 on Instagram and Twitter
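
The averaging step described in this section can be sketched with NumPy. This is a minimal illustration, not SocialVec's actual implementation: the helper `average_embedding` and the `toy_vectors` lookup are hypothetical, and skipping followed accounts that have no embedding is an assumption.

```python
import numpy as np

def average_embedding(followed_ids, vectors):
    """Mean of the vectors of the followed accounts that have an embedding.

    `vectors` is a hypothetical dict mapping a user ID string to its vector.
    IDs without an entry are skipped (an assumption made for this sketch).
    """
    known = [vectors[str(uid)] for uid in followed_ids if str(uid) in vectors]
    if not known:
        raise ValueError("none of the given IDs has an embedding")
    return np.mean(known, axis=0)

# Toy two-dimensional vectors standing in for real embeddings.
toy_vectors = {
    "1": np.array([1.0, 0.0]),
    "2": np.array([0.0, 1.0]),
}

# ID 999 has no vector, so only IDs 1 and 2 contribute to the mean.
avg = average_embedding([1, 2, 999], toy_vectors)
```

Accepting both integer and string IDs mirrors the lookup behaviour shown at the start of this notebook.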


## Get similar users for multiple users
The get_similar function also accepts a list of Twitter IDs, in which case it returns the users most similar to the average of the given users.

In [22]:
edu = ['Harvard', 'MIT', 'UCLA']
edu_ids = [sv.get_userid(name) for name in edu]

sports = ['FCBarcelona', 'ManUtd', 'realmadrid']
sports_ids = [sv.get_userid(name) for name in sports]

In [23]:
sv.get_similar(edu_ids).head(3)

Unnamed: 0,twitter_id,similarity,screen_name,name,description
0,18036441,0.877006,Stanford,Stanford University,Stanford is one of the world's leading researc...
1,5694822,0.830344,Princeton,Princeton University,The official Twitter account of Princeton Univ...
2,14884486,0.829955,BrownUniversity,Brown University,Official Twitter feed for Brown University. 🐻


In [24]:
sv.get_similar(sports_ids).head(3)

Unnamed: 0,twitter_id,similarity,screen_name,name,description
0,90836187,0.938566,andresiniesta8,Andrés Iniesta,⚽️ @visselkobe player | Kobe-BCN-Fuentealbilla...
1,213745334,0.937419,LuisSuarez9,Luis Suárez,Club Nacional de Football player. Born in Salt...
2,140750163,0.920962,juanmata8,Juan Mata García,Professional football player. Member of @Commo...
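
Ranking accounts against the average of a group can be sketched as follows. The function `top_k_similar`, the toy `ids` and `matrix`, and the use of cosine similarity as the ranking score are assumptions made for illustration; the notebook does not show how get_similar is implemented.

```python
import numpy as np

def top_k_similar(query, ids, matrix, k=3):
    """Return the k (id, cosine similarity) pairs closest to `query`."""
    # Normalize rows and query so that a dot product equals cosine similarity.
    unit_rows = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    unit_query = query / np.linalg.norm(query)
    scores = unit_rows @ unit_query
    order = np.argsort(-scores)[:k]  # indices of the highest scores first
    return [(ids[i], float(scores[i])) for i in order]

# Toy data: four accounts with two-dimensional vectors.
ids = ["a", "b", "c", "d"]
matrix = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])

# Average the vectors of accounts "a" and "b", then rank all accounts against it.
group = np.mean(matrix[[0, 1]], axis=0)
ranking = top_k_similar(group, ids, matrix, k=2)
```

In this sketch the group members themselves appear in the ranking; the tables above suggest the real function filters or ranks differently, so treat the output only as an illustration of the idea.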


## Get similarity

In [40]:
sv.get_similarity('barackobama', 'realdonaldtrump')

0.4535221

### Get similarity between a vector and a user

In [41]:
sv.get_similarity(sv[12], 'realdonaldtrump')

0.46564198
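
Scores like the ones above are consistent with cosine similarity, the measure gensim's KeyedVectors uses for similarity queries. Assuming get_similarity follows the same convention (this notebook does not confirm it), the computation can be sketched as below; `cosine_similarity` is a hypothetical helper, not part of SocialVec.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Vectors pointing in the same direction score close to 1,
# orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```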

## Fun with vector arithmetic

In [None]:
# Analogy pattern, as in word2vec: positive=['woman', 'king'], negative=['man']

In [73]:
ida = sv.get_userid('BarackObama')
idb = sv.get_userid('BillClinton')
idc = sv.get_userid('hillaryclinton')

In [87]:
sv.get_screen_name(sv.sv.wv.most_similar(positive=[sv.get_userid('BarackObama'), sv.get_userid('michelleobama')],
                                         negative=[sv.get_userid('JoeBiden')],
                                         topn=1)[0][0])

'ReginaKing'

In [79]:
sv.get_similar(sv['michelleobama'] - sv['POTUS44'] + sv['HillaryClinton'])

Unnamed: 0,twitter_id,similarity,screen_name,name,description
0,409486555,0.852655,MichelleObama,Michelle Obama,Girl from the South Side and former First Lady...
1,1339835893,0.844063,HillaryClinton,Hillary Clinton,"2016 Democratic Nominee, SecState, Senator, ha..."
2,1093090866,0.768605,FLOTUS44,First Lady- Archived,Office of First Lady Michelle Obama. This is a...
3,402957663,0.766591,PPFA,Planned Parenthood,Hi! We’re America’s most trusted provider of s...
4,2717254872,0.762603,violadavis,Viola Davis,"Academy-Award winning actress, philanthropist,..."
5,325830217,0.758946,VP44,VP Biden (Archived),This is an archive of an Obama Administration ...
6,970207298,0.756475,SenWarren,Elizabeth Warren,"U.S. Senator, Massachusetts. She/her/hers. Off..."
7,937499232,0.754582,Malala,Malala,Advocate for girls’ education & women's equali...
8,964032914626359296,0.752082,cameron_kasky,Cam Kasky,Chief Executive Officer of Lockheed Martin
9,757303975,0.746976,ChelseaClinton,Chelsea Clinton,"Mom of Charlotte, Aidan & Jasper, Married to M..."
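
The analogy pattern used in this section can be illustrated on toy vectors. Everything here is hypothetical: the three-dimensional `toy_vectors` are hand-picked so that the analogy works exactly, and `nearest` is a helper written for this sketch. Note that gensim's `most_similar` combines unit-normalized vectors and leaves the input keys out of its results, so it may not agree with plain arithmetic followed by get_similar.

```python
import numpy as np

# Hand-picked toy vectors, not real embeddings.
toy_vectors = {
    "king": np.array([1.0, 1.0, 0.0]),
    "man": np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "apple": np.array([-1.0, 0.0, 0.0]),
}

def nearest(query, vectors, exclude=()):
    """Key whose vector has the highest cosine similarity to `query`."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    candidates = {k: v for k, v in vectors.items() if k not in exclude}
    return max(candidates, key=lambda k: cosine(query, candidates[k]))

# king - man + woman, then look up the closest remaining key.
query = toy_vectors["king"] - toy_vectors["man"] + toy_vectors["woman"]
answer = nearest(query, toy_vectors, exclude={"king", "man", "woman"})
```

Excluding the input keys matters: without it, one of the inputs is often the closest match, as the MichelleObama and HillaryClinton rows in the table above suggest.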


# Classification Examples

In [4]:
sv.init_classifier()


In [16]:
sv.classifier.predict_political(sv['BarackObama'])

('Democrat', 0.9997575581364799)

In [17]:
sv.classifier.predict_political(sv['realdonaldtrump'])

('Republican', 0.9569854736328125)

# Get the embeddings of a user who is not a popular entity

In [5]:
import toml
import tweepy

In [6]:
tweepy_config = toml.load("tweepy.toml")
tweepy_credentials = tweepy_config['credentials']

In [7]:
auth = tweepy.OAuthHandler(tweepy_credentials['consumer_key'], tweepy_credentials['consumer_secret'])
auth.set_access_token(tweepy_credentials['access_token'], tweepy_credentials['access_token_secret'])
api = tweepy.API(auth)  # optionally add a proxy, e.g.: proxy="http://proxy-chain.intel.com:911"
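
The cells above read four values from the credentials table of tweepy.toml. A small check like the following could catch a missing key before authentication fails. It is only a sketch: `REQUIRED_KEYS`, `missing_credentials`, and the placeholder values are hypothetical, and the dictionary simply mirrors the layout implied by the lookups above.

```python
# Keys that the authentication cell reads from the "credentials" table.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)

def missing_credentials(config):
    """Return the required keys that are absent or empty in `config`."""
    credentials = config.get("credentials", {})
    return [key for key in REQUIRED_KEYS if not credentials.get(key)]

# Same shape as the dictionary returned by toml.load("tweepy.toml").
example_config = {
    "credentials": {
        "consumer_key": "PLACEHOLDER",
        "consumer_secret": "PLACEHOLDER",
        "access_token": "PLACEHOLDER",
        # "access_token_secret" is deliberately left out to show the check.
    }
}

missing = missing_credentials(example_config)
```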

In [9]:
friends = api.get_friend_ids(screen_name="alonzoizner")

In [10]:
nirlotan_embeddings = sv.get_average_embeddings(friends)[0]

In [117]:
sv.get_similar(nirlotan_embeddings)

Unnamed: 0,twitter_id,similarity,screen_name,name,description
0,27530178,0.807304,Pocket,Pocket,Capture content that fascinates you from acros...
1,740983,0.79708,loic,Loic Le Meur,I write and meditate every day as if my life d...
2,10955762,0.796126,petecashmore,Pete Cashmore,👋 Founder @mashable. TIME 100. Inc 30under30. ...
3,28172926,0.795466,WSJTech,WSJ Tech,The Wall Street Journal's (@WSJ) home for glob...
4,15813140,0.794063,GoogleCloudTech,Google Cloud Tech,"Follow along for how-tos, demos, product news,..."
5,20733756,0.793495,MegWhitman,Meg Whitman,
6,42703075,0.792616,cnntech,CNN Tech,All the ways tech impacts your life.
7,14118534,0.792137,sourceforge,SourceForge,Your Trusted Source for Software. @sfnet_ops f...
8,286802800,0.791911,gklst,Geeklist,Claim your username! The 1st Achievement-Based...
9,18164662,0.791208,usatodaytech,USA TODAY Tech,"Breaking tech news, product reviews and more f..."


In [11]:
sv.classifier.predict_political(nirlotan_embeddings)

('Democrat', 0.2910704016685486)

In [27]:
sv.sv.wv

<gensim.models.keyedvectors.KeyedVectors at 0x104eb9f40>

In [29]:
sv.classifier.predict_political(sv['dianabuttu'])

('Democrat', 0.4005640745162964)

In [13]:
sv.get_similarity(nirlotan_embeddings, sv['NARAL'])

0.648697

In [15]:
sv.get_similarity(nirlotan_embeddings, sv['nrlc'])

0.48921016

In [17]:
sv.get_similarity(nirlotan_embeddings, sv['GiffordsCourage'])

0.59673774

In [18]:
sv.get_similarity(nirlotan_embeddings, sv['GunOwners'])

0.31398708