# Steps

1. Collect data
2. Generate JSON of integrated subset from all sources
3. Convert to final forms

  a. RSS feed
  b. HTML page
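
As a rough sketch of step 2, the per-source results could be merged into one JSON document. The `integrate` helper, the `source`/`title`/`url` field names, and the Kickstarter placeholder record are all assumptions made for illustration; the real schema is still to be decided.

```python
import json

def integrate(*sources):
    """Merge (source_name, records) pairs into one list, tagging each record."""
    merged = []
    for source_name, records in sources:
        for record in records:
            # Hypothetical schema: every record gains a 'source' key.
            merged.append({'source': source_name, **record})
    return merged

# Placeholder records, assumed to be normalized already.
indiegogo = [{'title': 'Sean Dayton :: Psalms EP',
              'url': 'https://www.indiegogo.com/projects/sean-dayton-psalms-ep/piad?sa=0&sp=0'}]
kickstarter = [{'title': 'Example project',
                'url': 'https://example.com/placeholder'}]

combined_json = json.dumps(
    integrate(('indiegogo', indiegogo), ('kickstarter', kickstarter)),
    indent=2,
)
```

The same JSON could then feed both final forms (the RSS feed and the HTML page), so each converter only has to understand one schema.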

# Getting the data

## Indiegogo

There's no API, so we'll need to scrape the HTML.

### Libraries used

In [7]:
!pip install requests beautifulsoup4

Cleaning up...


In [3]:
import bs4
import requests

### Download

In [4]:
resp = requests.get('https://www.indiegogo.com/explore?filter_title=dayton')

### Parse

The BeautifulSoup library converts a raw string of HTML into a highly searchable object.

In [5]:
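# Name the parser explicitly so the result does not depend on which parsers are installed
soup = bs4.BeautifulSoup(resp.text, 'html.parser')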

Inspecting the HTML, it looks like each project is described in a `div` of class `i-project-card`. For example:

In [19]:
proj0 = soup.find_all('div', class_='i-project-card')[0]
proj0

<div class="i-project-card ">
<a class="i-category-header" href="/explore/music">
<span class="i-icon i-category-icon i-glyph-icon-22-music"></span>
<span>Music</span>
</a>
<a class="i-project" href="/projects/sean-dayton-psalms-ep/piad?sa=0&amp;sp=0">
<div class="i-img" data-src="https://res.cloudinary.com/indiegogo-media-prod-cld/image/upload/c_fill,h_220,w_220/v1407165049/grmirid418gsa0lwwwyq.jpg">
</div>
<div class="i-content">
<div class="i-title">Sean Dayton :: Psalms EP</div>
<div class="i-tagline ">Help Sean Dayton record a Psalms EP. Pre-order the album and get some great rewards!</div>
</div>
<div class="i-stats">
<span class="currency currency-medium"><span>$4,000</span><em>CAD</em></span>
<div class="i-progress-bar">
<div class="i-complete" style="width: 100%"></div>
</div>
<div class="i-bottom-row">
<div class="i-percent">
            100%
          </div>
<div class="i-time-left">
            0 time left
          </div>
</div>
</div>
</a></div>
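
The fields of interest can be pulled out of a card like the one above with a few `find` calls. This is only a sketch: `text_of` and `parse_card` are hypothetical helpers, the sample HTML is abbreviated from the printed card, and the class names are assumed to stay as they appear on the page today.

```python
import bs4

# Abbreviated copy of the card printed above.
CARD_HTML = """
<div class="i-project-card ">
<a class="i-category-header" href="/explore/music"><span>Music</span></a>
<a class="i-project" href="/projects/sean-dayton-psalms-ep/piad?sa=0&amp;sp=0">
<div class="i-content">
<div class="i-title">Sean Dayton :: Psalms EP</div>
<div class="i-tagline ">Help Sean Dayton record a Psalms EP.</div>
</div>
<div class="i-stats">
<div class="i-percent">
            100%
          </div>
</div>
</a></div>
"""

def text_of(card, css_class):
    """Return the stripped text of the first descendant with css_class, or None."""
    tag = card.find(class_=css_class)
    return tag.get_text(strip=True) if tag else None

def parse_card(card):
    """Hypothetical helper: flatten one i-project-card div into a dict."""
    link = card.find('a', class_='i-project')
    return {
        'title': text_of(card, 'i-title'),
        'tagline': text_of(card, 'i-tagline'),
        'category': text_of(card, 'i-category-header'),
        'percent': text_of(card, 'i-percent'),
        'href': link['href'] if link else None,
    }

card = bs4.BeautifulSoup(CARD_HTML, 'html.parser').find('div', class_='i-project-card')
record = parse_card(card)
```

Returning `None` for a missing element keeps one malformed card from aborting a loop over `soup.find_all('div', class_='i-project-card')`.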

We may want to drill into each individual project page for more details.

In [23]:
detail_url = proj0.find('a', class_='i-project')
detail_url['href']

'/projects/sean-dayton-psalms-ep/piad?sa=0&sp=0'

In [29]:
detail_url = 'https://www.indiegogo.com' + detail_url['href']
detail_url

'https://www.indiegogo.com/projects/sean-dayton-psalms-ep/piad?sa=0&sp=0'
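
String concatenation works here because the `href` is site-relative. As a slightly more defensive sketch, `urllib.parse.urljoin` from the standard library also copes with an `href` that is already absolute. `BASE_URL` and `absolute_url` are names introduced for illustration only.

```python
from urllib.parse import urljoin

BASE_URL = 'https://www.indiegogo.com'

def absolute_url(href, base=BASE_URL):
    """Resolve a relative or absolute href against the site root."""
    return urljoin(base, href)

# Site-relative href, as found on the project card.
relative_result = absolute_url('/projects/sean-dayton-psalms-ep/piad?sa=0&sp=0')

# An href that is already absolute is returned unchanged.
absolute_result = absolute_url('https://example.com/elsewhere')
```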

In [26]:
# detail_url is already the full URL string built in the previous cell
detail_resp = requests.get(detail_url)

In [28]:
detail_soup = bs4.BeautifulSoup(detail_resp.text, 'html.parser')
detail_soup

<!DOCTYPE html>

<html>
<head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# indiegogo: http://ogp.me/ns/fb/indiegogo#">
<meta charset="utf-8"/>
<script type="text/javascript">window.NREUM||(NREUM={});NREUM.info={"beacon":"bam.nr-data.net","errorBeacon":"bam.nr-data.net","licenseKey":"52ee8e3327","applicationID":"265213","transactionName":"J1gLEkINVVQDRUkTSgtdAAVEERZLDlgR","queueTime":2,"applicationTime":341,"agent":"js-agent.newrelic.com/nr-632.min.js"}</script>
<script type="text/javascript">(window.NREUM||(NREUM={})).loader_config={xpid:"UQEDVFdACgUFVlBR"};window.NREUM||(NREUM={}),__nr_require=function(t,e,n){function r(n){if(!e[n]){var o=e[n]={exports:{}};t[n][0].call(o.exports,function(e){var o=t[n][1][e];return r(o?o:e)},o,o.exports)}return e[n].exports}if("function"==typeof __nr_require)return __nr_require;for(var o=0;o<n.length;o++)r(n[o]);return r}({QJf3ax:[function(t,e){function n(t){function e(e,n,a){t&&t(e,n,a),a||(a={});for(var c=s(e),f=c.length,u=i(a,o,r),d=0;f>d;

## Kickstarter 

There's an undocumented API that can give us JSON.

In [13]:
kicks_raw = requests.get('http://www.kickstarter.com/projects/search.json?search=&term=dayton')

In [14]:
import json

In [15]:
data = json.loads(kicks_raw.text)
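
Each entry in `data['projects']` is a fairly large dict, so it may help to reduce it to a few fields early. The sketch below assumes the keys `blurb`, `backers_count`, `country`, and `created_at` are present, and that `created_at` is a Unix timestamp in seconds; the API is undocumented, so neither is guaranteed. `summarize_project` is a hypothetical helper.

```python
from datetime import datetime, timezone

def summarize_project(project):
    """Hypothetical helper: keep a handful of fields from one project dict."""
    # Assumption: created_at counts seconds since the Unix epoch.
    created = datetime.fromtimestamp(project['created_at'], tz=timezone.utc)
    return {
        'blurb': project.get('blurb'),
        'backers_count': project.get('backers_count'),
        'country': project.get('country'),
        'created_at': created.isoformat(),
    }

# Values copied from the first project returned by the search.
sample = {
    'backers_count': 32,
    'blurb': 'Twist Cupcakery needs your help launching our gourmet Cupcakery in Downtown Dayton',
    'country': 'US',
    'created_at': 1426798848,
}
summary = summarize_project(sample)
```

Applied to the live response, this would look like `[summarize_project(p) for p in data['projects']]`.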

In [18]:
data['projects'][0]

{'backers_count': 32,
 'blurb': 'Twist Cupcakery needs your help launching our gourmet Cupcakery in Downtown Dayton',
 'category': {'id': 312,
  'name': 'Restaurants',
  'parent_id': 10,
  'position': 9,
  'slug': 'food/restaurants',
  'urls': {'web': {'discover': 'http://www.kickstarter.com/discover/categories/food/restaurants'}}},
 'country': 'US',
 'created_at': 1426798848,
 'creator': {'avatar': {'medium': 'https://ksr-ugc.imgix.net/avatars/14582541/20140223_172627_1_(2).original.jpg?v=1429873398&w=160&h=160&fit=crop&auto=format&q=92&s=4fc3285d6477ec6898bd29151e22b3a2',
   'small': 'https://ksr-ugc.imgix.net/avatars/14582541/20140223_172627_1_(2).original.jpg?v=1429873398&w=80&h=80&fit=crop&auto=format&q=92&s=3e14b82a86a7442f811eb3868276081d',
   'thumb': 'https://ksr-ugc.imgix.net/avatars/14582541/20140223_172627_1_(2).original.jpg?v=1429873398&w=40&h=40&fit=crop&auto=format&q=92&s=f931e8f349ac2ec62d51a9f5b29c0be5'},
  'id': 1661945159,
  'name': "Alexandra 'Kate' Rivers",
  'urls