Juan-Pablo Velez edited this page Oct 14, 2013 · 23 revisions

GivingGraph is a tool that gathers and merges disparate information about nonprofits from structured, unstructured, and social sources. Then, it constructs a network of nonprofits/companies/people from this information using text mining, and analyzes this graph using social network analysis. Finally, it makes this "giving graph" and analysis available to everyone through an API.

Fetching structured data - basic info on nonprofits

GivingGraph aggregates basic information about a nonprofit from structured databases such as GuideStar and CharityNavigator, given appropriate API permissions and the organization's nonprofit ID (EIN).

  • Nonprofit's name, location, description, financials, and issue category (NTEE code).

Gathering unstructured data - nonprofits and companies

GivingGraph fetches unstructured, natural language data from the web in order to link nonprofits to companies:

  • News articles: GivingGraph uses web search APIs like Yahoo News to gather news stories that mention companies and the nonprofits and causes they support. Currently, company names that match common English words (e.g. "April") are filtered out.
  • Webpages: the GuideStar and CharityNavigator databases often list a nonprofit's homepage. GivingGraph can crawl these webpages and automatically extract the company names that appear on them, which may indicate a relationship, by matching the page text against lists of company names. The company names are stored in the database's "companies" table, which was populated with a list of names scraped from Bloomberg.
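
As a rough illustration of the matching and filtering described above, here is a minimal sketch. It assumes the company names are already loaded into a Python list; `COMPANY_NAMES`, `COMMON_WORDS`, and `find_company_mentions` are hypothetical names used only for this example and are not taken from the GivingGraph code.

```python
import re

# Hypothetical stand-ins: in GivingGraph the names live in the "companies" table.
COMPANY_NAMES = ["Acme Corp", "April", "Globex"]
# Hypothetical list of company names that double as common English words.
COMMON_WORDS = {"april", "target", "gap"}


def find_company_mentions(text, company_names=COMPANY_NAMES, common_words=COMMON_WORDS):
    """Return the company names that appear as whole words or phrases in text."""
    mentions = []
    for name in company_names:
        if name.lower() in common_words:
            continue  # skip names that match ordinary English words
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            mentions.append(name)
    return mentions


page_text = "In April, Acme Corp sponsored our annual river cleanup."
print(find_company_mentions(page_text))
```

Here "April" is skipped because it is on the common-word list, so only "Acme Corp" is reported for the sample text.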

Fetching social data - nonprofits and supporters

GivingGraph can also fetch a nonprofit's Twitter data:

  • the name of the organization's official Twitter account is found by searching Yahoo (via search.py). We simply search Yahoo for "twitter" and the organization's name, and assume the first result is the official Twitter account.
  • the nonprofit's Twitter followers are retrieved via the Twitter API.
  • the nonprofit's tweets are retrieved via the Twitter API.
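
The "first result" heuristic could be sketched roughly as follows. This is not the code in search.py: `guess_twitter_handle` is a hypothetical helper, the list of result URLs is assumed to come from whatever search call is used, and the check that the URL is actually a Twitter profile is an extra safeguard added for this example.

```python
from urllib.parse import urlparse


def guess_twitter_handle(result_urls):
    """Return the handle from the first search result if it looks like a
    Twitter profile URL, otherwise None."""
    if not result_urls:
        return None
    parsed = urlparse(result_urls[0])
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "twitter.com":
        return None
    # A profile URL has exactly one path segment, e.g. /redcross
    parts = [p for p in parsed.path.split("/") if p]
    return parts[0] if len(parts) == 1 else None


# Made-up search results for the query: twitter "Example Nonprofit"
results = ["https://twitter.com/examplenonprofit", "https://example.org/about"]
print(guess_twitter_handle(results))
```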

Graph building - linking everything together

With all these sources of information in hand, we use natural language processing techniques to build a weighted graph that connects nonprofits, companies, and people (Twitter followers).

  • Similarity relationships between nonprofits, based on their tweets.
  • Similarity relationships between nonprofits, based on their webpages.

These similarity calculations are done in givinggraph/analysis/similarity.py, which applies NLP topic modeling techniques from the Python gensim library.
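
To give a feel for what a text similarity score between two nonprofits looks like, here is a minimal, dependency-free sketch. It deliberately swaps in plain TF-IDF with cosine similarity rather than the gensim topic models that similarity.py uses, and all function names and sample texts are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Turn each document into a sparse {term: tf * idf} dictionary."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {t: c * math.log(n / df[t]) for t, c in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up text standing in for each nonprofit's tweets or webpage.
docs = [
    "clean water for rural communities",
    "safe clean water access",
    "after school arts programs",
]
vectors = tfidf_vectors(docs)
water_vs_water = cosine(vectors[0], vectors[1])
water_vs_arts = cosine(vectors[0], vectors[2])
print(water_vs_water > water_vs_arts)
```

The resulting scores can serve as edge weights between nonprofits; a topic model plays the same role but compares documents by topic rather than by exact shared words.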

  • Follower relationships between Twitter users and nonprofits.
  • Relationships between companies and nonprofits, based on news article mentions. News articles are found by searching Yahoo News for the name of a nonprofit. Regular expressions are then used to search the text of each article for company names that appear in a supportive context (i.e., the company name and a phrase of support, such as "helped" or "donated to", are found near each other). Each such mention is treated as a relationship between the company and the nonprofit.
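
The supportive-context search might look roughly like the sketch below. The phrase list, the 60-character window, and the rule that a match may not cross a sentence boundary are all assumptions made for this example, not details taken from the GivingGraph code.

```python
import re

# Hypothetical list of support phrases; the real list may differ.
SUPPORT_PHRASES = ["helped", "donated to", "sponsored", "partnered with"]


def find_supporting_companies(article_text, company_names, window=60):
    """Return companies named within `window` characters of a support phrase,
    in either order, inside the same sentence."""
    support = "|".join(re.escape(p) for p in SUPPORT_PHRASES)
    gap = rf"[^.!?]{{0,{window}}}?"  # short gap that stays within one sentence
    found = []
    for name in company_names:
        company = re.escape(name)
        pattern = (
            rf"\b{company}\b{gap}\b(?:{support})\b"
            rf"|\b(?:{support})\b{gap}\b{company}\b"
        )
        if re.search(pattern, article_text, flags=re.IGNORECASE):
            found.append(name)
    return found


article = (
    "Last year Acme Corp donated to the food bank. "
    "Globex reported quarterly earnings."
)
print(find_supporting_companies(article, ["Acme Corp", "Globex"]))
```

In the sample article only "Acme Corp" is linked to the nonprofit, because "Globex" never shares a sentence with a support phrase.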

Graph analysis - understanding supporters, recommending partnerships, finding nonprofit communities, understanding connectedness

We then use social network analysis, including community detection algorithms, to recommend company partnerships, discover communities of similar nonprofits, and understand the network structure of those communities.

  • Basic Twitter account summary stats, including average retweets, average favorites, and most frequent hashtags. This part of the analysis could be developed much further.
  • Companies the nonprofit could partner with, based on firms mentioned in news articles alongside similar organizations.
  • Social network analysis of individual nonprofits: how connected they are to other organizations, etc.
  • Social network analysis of nonprofit verticals: how well connected environmental organizations are, etc.
  • Community detection can be performed via givinggraph/analysis/network_analysis.py, which depends on Thomas Aynaud's community detection algorithm (see givinggraph/analysis/community.py).
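
One simple way to picture the partnership recommendation is sketched below: score each company by the similarity of the nonprofits it is already linked to, and skip companies the target nonprofit already has a relationship with. This is an illustrative scoring rule rather than the project's actual method, and `recommend_partners` and the sample data are hypothetical.

```python
from collections import Counter


def recommend_partners(nonprofit, similar, company_links, top_n=3):
    """Rank companies linked to similar nonprofits but not yet to `nonprofit`.

    similar:       {nonprofit: {other_nonprofit: similarity_weight}}
    company_links: {nonprofit: set of company names found in news articles}
    """
    already_linked = company_links.get(nonprofit, set())
    scores = Counter()
    for other, weight in similar.get(nonprofit, {}).items():
        for company in company_links.get(other, set()):
            if company not in already_linked:
                scores[company] += weight
    # Highest score first; break ties alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [company for company, _ in ranked[:top_n]]


# Made-up graph fragments for illustration.
similar = {"River Trust": {"Clean Lakes": 0.8, "Arts Now": 0.1}}
company_links = {
    "River Trust": {"Acme Corp"},
    "Clean Lakes": {"Acme Corp", "Globex"},
    "Arts Now": {"Initech"},
}
print(recommend_partners("River Trust", similar, company_links))
```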

GivingGraph API - publishing the graph and analysis

In order for the nonprofit-company graph we've constructed and the analysis we're doing on top of it to be useful, we make them available through a simple API. Apps can be built on top of this API that help nonprofits answer key questions, thus alleviating knowledge gaps in the sector.