Graph Hack at Graph Connect SFO 2016
Latest commit 354d301 Oct 12, 2016 @ryguyrg ryguyrg committed on GitHub Update README.md

Graph Hack 2016

TLDR

  • Build something cool with Neo4j
  • Register your project and team here
  • Present your project
  • Win cool prizes

Overview

WIFI

Network: g|Events Password: machinelearning

Schedule

4:00pm-4:30pm Arrive
4:30pm-5:00pm Project pitches
5:00pm-5:30pm Teams form
5:30pm-9:00pm Hacking
6:00pm Food and Drinks served
9:00pm-9:45pm Project presentations
10:00pm Prize announcements
10:00pm Bar closes (keynote is early!)

Rules

Teams!

  • Teams are encouraged, but individual participation is allowed.
  • Teams may have up to 5 participants, but only 4 prizes are awarded per team.
  • Register your project and all teammates here

Winners

  • 1st place team
  • 2nd place team
  • 3rd place team
  • Honorable mentions team(s)

Criteria for judging - all projects must use Neo4j

  • Creativity (perhaps including merging disparate data sources)
  • Use of Cypher and Neo4j
  • Interesting new information uncovered
  • Quality of presentation (perhaps including visualization)
  • Completeness

Prize Distribution

  • A variety of prizes will be on display throughout the event
  • Each member of winning teams may choose (1) prize
  • First place team will be announced and choose prizes first, followed by 2nd place team, followed by 3rd place team
  • As long as prize inventory allows, honorable mention teams will be announced and be able to choose prizes
  • Each prize is limited in quantity. Within each winning team, individuals will need to decide who gets which prize.

Prizes

Datasets

Panama Papers + (Offshore Leaks + Bahamas Leaks)

NOTE: This dataset has an interactive Neo4j Browser guide for exploring the data:

Legis-graph (US Congress)

Campaign Finance - FEC Filings

  • :play http://guides.neo4j.com/legisgraph/fecimport.html

Combined US Congress + FEC

US Election Data

Election Tweets

Download

wget http://demo.neo4j.com.s3.amazonaws.com/electionTwitter/neo4j-election-twitter-demo.tar.gz
tar -xvzf neo4j-election-twitter-demo.tar.gz
cd neo4j-enterprise-3.0.3
bin/neo4j start

Or use the hosted instance

NOTE: This dataset has an interactive Neo4j Browser guide for exploring the data:

Election Forecast

Fivethirtyeight has made the data behind their famous election forecast publicly available:

http://projects.fivethirtyeight.com/2016-election-forecast/summary.json

You can easily pull this into Neo4j using apoc.load.json:

CALL apoc.load.json("http://projects.fivethirtyeight.com/2016-election-forecast/summary.json") YIELD value AS data
RETURN data

Hillary Clinton's Emails

// Creating the graph
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://s3-us-west-2.amazonaws.com/neo4j-datasets-public/Emails-refined.csv" AS line
MERGE (fr:Person {alias: COALESCE(line.MetadataFrom, line.ExtractedFrom,'')})
MERGE (to:Person {alias: COALESCE(line.MetadataTo, line.ExtractedTo, '')})
MERGE (em:Email { id: line.Id })
ON CREATE SET em.foia_doc=line.DocNumber, em.subject=line.MetadataSubject, em.to=line.MetadataTo, em.from=line.MetadataFrom, em.text=line.RawText, em.ex_to=line.ExtractedTo, em.ex_from=line.ExtractedFrom
MERGE (to)<-[:TO]-(em)-[:FROM]->(fr)
MERGE (fr)-[r:HAS_EMAILED]->(to)
ON CREATE SET r.count = 1
ON MATCH SET r.count = r.count + 1;
// Updating counts
MATCH (a:Person)-[r]-(b:Email) WITH a, count(r) as count SET a.count = count;
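
Once the import has finished, a query along these lines can serve as a quick sanity check and a starting point for exploration. It is only a sketch: it relies on the Person label, the alias property, and the count kept on HAS_EMAILED by the script above, and the LIMIT of 10 is an arbitrary choice.

```cypher
// Most frequent sender/recipient pairs, using the counter kept on HAS_EMAILED
MATCH (fr:Person)-[r:HAS_EMAILED]->(to:Person)
// Skip the empty-alias placeholder that COALESCE(..., '') can produce
WHERE fr.alias <> '' AND to.alias <> ''
RETURN fr.alias AS sender, to.alias AS recipient, r.count AS emails
ORDER BY emails DESC
LIMIT 10;
```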

Fivethirtyeight

The Fivethirtyeight team does an amazing job of providing the data behind many of their stories in their GitHub repo. There are a lot of possibilities, but here are a few ideas we hacked up:

hip-hop-candidate-lyrics

Import

LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/fivethirtyeight/data/master/hip-hop-candidate-lyrics/genius_hip_hop_lyrics.csv" AS row
MERGE (c:Candidate {name: row.candidate})
MERGE (a:Artist {name: row.artist})
MERGE (s:Sentiment {type: row.sentiment})
MERGE (t:Theme {type: row.theme})
MERGE (song:Song {name: row.song})
MERGE (line:Line {text: row.line})
SET line.url = row.url
MERGE (line)-[:MENTIONS]->(c)
MERGE (line)-[:HAS_THEME]->(t)
MERGE (line)-[:HAS_SENTIMENT]->(s)
MERGE (song)-[:HAS_LINE]->(line)
MERGE (a)-[r:PERFORMS]->(song)
SET r.date = row.album_release_date
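
With the lyrics loaded, one possible first question is how each candidate is portrayed. The query below is a minimal sketch that uses only the labels and relationship types created by the import above; the column aliases are illustrative.

```cypher
// Count lyric lines per candidate, broken down by sentiment
MATCH (c:Candidate)<-[:MENTIONS]-(l:Line)-[:HAS_SENTIMENT]->(s:Sentiment)
RETURN c.name AS candidate, s.type AS sentiment, count(l) AS lines
ORDER BY candidate, lines DESC;
```

Swapping HAS_SENTIMENT and Sentiment for HAS_THEME and Theme should give the same breakdown by theme.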

Crime data

Many governments use the Socrata data portal software to publish their data (e.g. crime, transportation). This means that we can use apoc.load.json to import data directly from any Socrata site. For example, to import San Francisco crime data:

CALL apoc.load.json("https://data.sfgov.org/resource/cuks-n6tp.json?$limit=5000&$offset=0") YIELD value AS crime
MERGE (c:Crime {incidntnum: crime.incidntnum})
ON CREATE SET c.address=crime.address, c.time=crime.time, c.dayofweek=crime.dayofweek
MERGE (cat:Category {name: crime.category})
// MERGE keeps relationships from being duplicated when the import is re-run
MERGE (c)-[:HAS_CATEGORY]->(cat)
MERGE (dis:District {name: crime.pddistrict})
MERGE (c)-[:OCCURRED_IN]->(dis);
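
After the import, a simple aggregate can confirm that the data landed as expected. This is a sketch that assumes only the Crime and District nodes and the OCCURRED_IN relationship created above.

```cypher
// Incidents per police district, most affected first
MATCH (c:Crime)-[:OCCURRED_IN]->(d:District)
// DISTINCT guards against double counting if an incident is linked more than once
RETURN d.name AS district, count(DISTINCT c) AS incidents
ORDER BY incidents DESC;
```

The import URL fetches only the first 5000 rows. To load more, the call can be repeated with $offset raised in steps of $limit.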

Other resources

Beyond the resources listed above, we think these might be interesting to explore, although we don't have Neo4j import scripts or graph exports for them:

Resources

Neo4j

You'll need to use Neo4j to participate in the hackathon. You can download Neo4j here or use one of the hosted versions above.

APOC - Awesome Procedures on Cypher

Graph algorithms, data import, job scheduling, full text search, geospatial, ...

Neo4j Folks

Grab your friendly Neo4j staff and community members if you have any questions.