Data
JohnHBrock edited this page Sep 9, 2013 · 9 revisions
GivingGraph collects data from several sources and stores them in a MySQL DB.
- GuideStar: (Requires API credentials) Given an EIN, we query GuideStar for the NTEE code, mission statement, annual revenue, and year founded. Results are saved to the `nonprofits` table.
- Charity News: Given a nonprofit name, we search the Yahoo News API for related news articles. (See searcher.py.) Results are saved to the `news_articles` table.
- Company Mentions: For each news article collected above, we search for mentions of companies. (See parser.py.) Results are saved to the `news_articles_companies_rel` table.
- Twitter Handle: Yahoo is queried for the Twitter handle of this nonprofit. The result is saved to the `nonprofits` table.
- Tweets: The most recent tweets from this nonprofit are collected and stored in `nonprofits_tweets`.
- Followers: The list of Twitter users who follow this nonprofit is collected and stored in `nonprofits_followers`.
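As an illustration of the company-mention step, here is a minimal sketch of matching known company names against article text. It is not taken from parser.py; the function name `find_company_mentions` and the whole-phrase, case-insensitive matching rule are assumptions made for this example.

```python
import re


def find_company_mentions(article_text, company_names):
    """Return the company names that occur as whole phrases in article_text.

    Hypothetical helper: the real parser.py may use a different strategy.
    """
    mentions = []
    for name in company_names:
        # Lookarounds keep "Apple" from matching inside "Applebee's".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, article_text, flags=re.IGNORECASE):
            mentions.append(name)
    return mentions


text = "Google and Microsoft pledged funds; Applebee's hosted the event."
print(find_company_mentions(text, ["Google", "Apple", "Microsoft"]))
```

Each (article, company) match found this way would correspond to one row in a relation table such as `news_articles_companies_rel`.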
After these data are collected, a similarity score is computed for each pair of nonprofits based on each data source:
- Tweet similarity is stored in `nonprofits_similarity_by_tweets`.
- Description similarity is stored in `nonprofits_similarity_by_description`.
- Follower similarity is stored in `nonprofits_similarity_by_tweet_ids`.
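The page does not say which similarity measures are used, so the following is only a sketch under assumptions: Jaccard similarity for follower sets and bag-of-words cosine similarity for text such as descriptions or tweets. The function names are hypothetical and do not come from the GivingGraph code.

```python
import math
from collections import Counter
from itertools import combinations


def jaccard_similarity(followers_a, followers_b):
    """Share of followers two nonprofits have in common (assumed measure)."""
    a, b = set(followers_a), set(followers_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def cosine_similarity(text_a, text_b):
    """Cosine similarity of simple word-count vectors (assumed measure)."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def pairwise_scores(items, similarity):
    """Score every unordered pair of nonprofits.

    items maps a nonprofit id to its data (a follower set or a text).
    """
    return {
        (a, b): similarity(items[a], items[b])
        for a, b in combinations(sorted(items), 2)
    }


followers = {"np1": {1, 2, 3}, "np2": {2, 3, 4}, "np3": {9}}
print(pairwise_scores(followers, jaccard_similarity))
```

Scoring all pairs is quadratic in the number of nonprofits, which is one reason to precompute the results and store them in tables rather than compute them per request.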
Once these similarity scores have been computed, nonprofits are clustered into communities, with the results stored in `nonprofits_communities_by_description`, `nonprofits_communities_by_tweets`, and `nonprofits_communities_by_tweet_words`.
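The clustering algorithm is not specified here. One simple way to turn pairwise scores into communities, shown purely as an assumed sketch, is to link any two nonprofits whose score meets a threshold and take the connected components of the resulting graph. The function name `cluster_communities` and the default threshold of 0.5 are hypothetical.

```python
def cluster_communities(scores, threshold=0.5):
    """Group nonprofits into communities by thresholded similarity.

    scores maps (id_a, id_b) pairs to a similarity score. Pairs scoring at
    or above the threshold are linked; each connected component becomes one
    community. This is an assumed stand-in, not the project's actual method.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    communities = {}
    for node in list(parent):
        communities.setdefault(find(node), set()).add(node)
    return list(communities.values())


scores = {("a", "b"): 0.9, ("b", "c"): 0.7, ("c", "d"): 0.1, ("a", "d"): 0.0}
print(cluster_communities(scores))
```

Running the same routine on each similarity table would yield one set of communities per data source, matching the separate community tables listed above.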
The results of our analysis are exposed through a REST API.