SocialVec

SocialVec is a general framework of social embeddings for eliciting social world knowledge from social networks. It was developed by Nir Lotan and Einat Minkov as part of their research, available at https://arxiv.org/abs/2111.03514

New: SocialVec is now a library you can import and use!

Installation

pip install socialvec

Initialization

Upon initialization, you can either create a new SocialVec instance with the default configuration or select a specific version of the model. Currently available versions are:

  • SocialVec2020.pkl.gz
  • SocialVec2020_2022.pkl.gz

The first time you use SocialVec with one of these models, the library downloads the model binaries to your machine. On subsequent uses no download is required, and loading is significantly faster.

from socialvec.socialvec import SocialVec
sv = SocialVec()

Usage Samples

Basic Usage Examples

Get the vector of a user by Twitter ID (string or integer) or by username

sv[12]
sv["12"]
sv["jack"]
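
As a rough illustration of how one object can accept an integer ID, a numeric string, or a username as a key, here is a toy stand-in. It is not the SocialVec implementation: the class `ToyEmbeddings`, its constructor arguments, and the case-insensitive username matching are all assumptions made for this sketch. Only the method name `get_userid` mirrors a call used elsewhere in this README.

```python
class ToyEmbeddings:
    """Toy stand-in for an embedding store; NOT the SocialVec implementation."""

    def __init__(self, vectors, screen_names):
        self._vectors = vectors  # {twitter_id (int): vector}
        # Screen name -> ID map; case-insensitive matching is an assumption.
        self._ids = {name.lower(): uid for uid, name in screen_names.items()}

    def get_userid(self, key):
        """Resolve an int ID, a numeric string, or a screen name to an int ID."""
        if isinstance(key, int):
            return key
        if key.isdigit():
            return int(key)
        return self._ids[key.lower()]

    def __getitem__(self, key):
        # All three key styles end up as the same integer lookup.
        return self._vectors[self.get_userid(key)]


toy = ToyEmbeddings(
    vectors={12: [0.1, 0.9], 989: [0.2, 0.8]},
    screen_names={12: "jack", 989: "om"},
)
same_vector = toy[12] == toy["12"] == toy["jack"]
```

One consequence of this design is that a screen name made up only of digits would be read as an ID, so a real implementation would need a rule for that case.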

Get similar users

sv.get_similar('jack')
   twitter_id  similarity  screen_name   name               description
0  6385432     0.841613    dickc         dick costolo       nan
1  989         0.831723    om            OM                 Partner emeritus @Trueventures I was a reporte...
2  5746452     0.827466    waltmossberg  Walt Mossberg      Board, News Literacy Project. Former columnist...
3  20536157    0.826462    Google        Google             #HeyGoogle
4  6708952     0.819312    SteveCase     Steve Case         Chairman of @Revolution. Chairman of @CaseFoun...
5  9534522     0.816885    Pogue         David Pogue        Host of “Unsung Science” podcast; "CBS Sunday ...
6  5763262     0.813040    karaswisher   Kara Swisher       Mother of (4) Dragons. Future resident of Hawa...
7  14749070    0.808801    Chad_Hurley   Chad Hurley        Co-Founder, @YouTube; Investor, @Warriors, @LA...
8  22255654    0.805819    johndoerr     John Doerr         Passionate about moving leaders to act—with sp...
9  37570179    0.804565    arrington     Michael Arrington  🏴‍☠️ Founder of TechCrunch, CrunchBase and Arringto...

Get the average embeddings of multiple users

To get an embedding for a user who is not a popular entity, collect the list of accounts that the user follows and pass it to the get_average_embeddings function. The function returns an embedding vector for that user.

** This function currently accepts only a list of user IDs. **

v = sv.get_average_embeddings([1, sv.get_userid('jack')], 989)
sv.get_similar(v[0])
   twitter_id  similarity  screen_name   name           description
0  12          1.000000    jack          jack           #bitcoin
1  6385432     0.841613    dickc         dick costolo   nan
2  989         0.831723    om            OM             Partner emeritus @Trueventures I was a reporte...
3  5746452     0.827466    waltmossberg  Walt Mossberg  Board, News Literacy Project. Former columnist...
4  20536157    0.826462    Google        Google         #HeyGoogle
5  6708952     0.819312    SteveCase     Steve Case     Chairman of @Revolution. Chairman of @CaseFoun...
6  9534522     0.816885    Pogue         David Pogue    Host of “Unsung Science” podcast; "CBS Sunday ...
7  5763262     0.813040    karaswisher   Kara Swisher   Mother of (4) Dragons. Future resident of Hawa...
8  14749070    0.808801    Chad_Hurley   Chad Hurley    Co-Founder, @YouTube; Investor, @Warriors, @LA...
9  22255654    0.805819    johndoerr     John Doerr     Passionate about moving leaders to act—with sp...
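
The idea of representing a user by the average of the accounts they follow can be sketched in plain Python. This is a minimal illustration, not the library's code: the function `average_embeddings`, the `toy_vectors` dictionary, the choice to skip unknown IDs, and the `(vector, count)` return shape are all assumptions of this sketch.

```python
def average_embeddings(followed_ids, vectors):
    """Element-wise mean of the vectors of the followed accounts that are known.

    IDs missing from `vectors` are skipped (an assumption of this sketch).
    Returns (mean_vector, number_of_ids_used), or (None, 0) if none are known.
    """
    known = [vectors[uid] for uid in followed_ids if uid in vectors]
    if not known:
        return None, 0
    dim = len(known[0])
    mean = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    return mean, len(known)


# Hypothetical two-dimensional vectors keyed by user ID.
toy_vectors = {12: [1.0, 0.0], 989: [0.0, 1.0]}

# ID 1 has no vector in this toy data, so only two IDs contribute to the mean.
mean_vector, used = average_embeddings([12, 989, 1], toy_vectors)
```

Returning the number of IDs actually used makes it easy to spot users whose followed accounts are mostly absent from the model.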

Get similar for multiple users

The get_similar function also accepts a list of Twitter IDs, in which case it returns the users most similar to the average of those users' vectors.

# Resolve screen names to Twitter IDs before querying
edu = ['Harvard', 'MIT', 'UCLA']
edu_ids = [sv.get_userid(name) for name in edu]

sports = ['FCBarcelona', 'ManUtd', 'realmadrid']
sports_ids = [sv.get_userid(name) for name in sports]
sv.get_similar(edu_ids)
   twitter_id  similarity  screen_name      name                     description
0  5695032     0.867065    Yale             Yale University          News, events and updates from Yale University.
1  5694822     0.861724    Princeton        Princeton University     The official Twitter account of Princeton Univ...
2  248795646   0.850461    Columbia         Columbia University      The official Twitter feed of Columbia Universi...
3  14884486    0.845595    BrownUniversity  Brown University         Official Twitter feed for Brown University. 🐻
4  33474655    0.840983    Cambridge_Uni    Cambridge University     Research, news and events from the University ...
5  18036441    0.838993    Stanford         Stanford University      Stanford is one of the world's leading researc...
6  17369110    0.833544    Cornell          Cornell University       Learning. Discovery. Engagement. Join the #Cor...
7  19606528    0.804404    HarvardHBS       Harvard Business School  Educating leaders who make a difference in the...
8  48289662    0.795457    UniofOxford      University of Oxford     Welcome to our official account 👋 Online 9am-5...
9  21226678    0.793840    dartmouth        Dartmouth                The official Twitter account of Dartmouth Coll...
sv.get_similar(sports_ids)
   twitter_id  similarity  screen_name     name                 description
0  740336334   0.931517    GarethBale11    Gareth Bale          Footballer. @LAFC and @FAWales. Instagram - ht...
1  344801362   0.917337    DavidLuiz_4     David Luiz           Enjoy the life!\n🔴⚫️💥\nhttps://t.co/6cHcpZY4nc…
2  140750163   0.915364    juanmata8       Juan Mata García     Professional football player. Member of @Commo...
3  112764971   0.913976    FCBarcelona_es  FC Barcelona         #ForçaBarça! ¡Síguenos!: @fcbarcelona_cat @fcb...
4  533085085   0.912526    M10             Mesut Özil           Football player @ibfk2014 ⚽️ | Co-Founder @Uni...
5  265982289   0.911782    D_DeGea         David de Gea         ⚽ Goalkeeper @ManUtd 🇪🇸 International with @Se...
6  1964571728  0.899911    Benzema         Karim Benzema        Football player - @equipedefrance @realmadrid ...
7  366592246   0.899444    hazardeden10    Eden Hazard          Belgium 🇧🇪
8  185827887   0.898743    cesc4official   Cesc Fàbregas Soler  Proud dad of 5 beautiful children. 35 years ol...
9  213745334   0.895597    LuisSuarez9     Luis Suárez          Club Nacional de Football player. Born in Salt...
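
Querying with several users at once can be pictured as averaging their vectors and ranking every other user against that average. The sketch below uses cosine similarity on toy data. Whether SocialVec uses cosine similarity internally is an assumption here, and `cosine`, `most_similar`, and `toy_vectors` are hypothetical names introduced only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_ids, vectors, top_n=10):
    """Rank users by similarity to the average vector of the query users."""
    queries = [vectors[uid] for uid in query_ids]
    dim = len(queries[0])
    centroid = [sum(q[i] for q in queries) / len(queries) for i in range(dim)]
    scored = [
        (uid, cosine(centroid, vec))
        for uid, vec in vectors.items()
        if uid not in query_ids  # leave the query users out of the results
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical two-dimensional vectors keyed by user ID.
toy_vectors = {
    1: [1.0, 0.0],
    2: [0.8, 0.2],
    3: [0.9, 0.1],
    4: [0.0, 1.0],
}

# User 3 points in the same direction as the average of users 1 and 2,
# so it ranks first; user 4 is nearly orthogonal and ranks last.
ranking = most_similar([1, 2], toy_vectors)
```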

SocialVecClassifier

Initialization

SocialVecClassifier is part of the socialvec package, so no additional installation is needed. However, you need to initialize it separately after creating the SocialVec object:

# Create a SocialVec object as described above
from socialvec.socialvec import SocialVec
sv = SocialVec()

# Initialize the classifier
sv.init_classifier()

Usage Samples

Get political classification for a user, using its SocialVec vector:

# The classifier gets a SocialVec embedding vector as input, e.g.:
sv.classifier.predict_political( sv['JoeBiden'] )

# or:
sv.classifier.predict_political( sv['realDonaldTrump'] )

predict_political returns a Republican/Democrat classification together with a confidence score between 0 and 1, where 1 means high confidence and 0 means no confidence (which may be expected for entities with no political affiliation).
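
Because a confidence near 0 can simply mean the account is not politically affiliated, it may help to apply a cutoff before reporting a label. The helper below is a hypothetical sketch, not part of the socialvec package. The exact return type of predict_political is not documented here, so the helper takes the label and confidence as plain arguments, and the 0.5 threshold is an arbitrary assumption.

```python
def describe_prediction(label, confidence, min_confidence=0.5):
    """Turn a label and a confidence in [0, 1] into a short readable verdict.

    `min_confidence` is an arbitrary cutoff chosen for this sketch.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < min_confidence:
        return "no clear political affiliation"
    return f"{label} (confidence {confidence:.2f})"


# Made-up values for illustration; these are not real classifier outputs.
verdict_strong = describe_prediction("Democrat", 0.93)
verdict_weak = describe_prediction("Republican", 0.12)
```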
