# API Data Schema
Geoff Pidcock | 20190526

## Scope
Use the Python Requests library to extract information about event listings, and determine an appropriate schema.

### TODO
- Pull sample dataset from Eventbrite, and double check schema

In [1]:
# See environment.yml for setup instructions
import sys
import os
import pandas as pd
import json
import requests

In [2]:
# Get API keys from the following places:
## Eventbrite: https://www.eventbrite.com.au/account-settings/apps
## Meetup: https://secure.meetup.com/meetup_api/key/
# The keys are stored in creds.py, two directories above this notebook
print(os.getcwd())
os.chdir(os.path.join('..', '..'))
print(os.getcwd())

C:\Users\Geoff Pidcock\Anaconda_Projects\Other Projects\events-aggregator\events-scraper\notebooks
C:\Users\Geoff Pidcock\Anaconda_Projects\Other Projects\events-aggregator


In [3]:
from creds import meetup_api_key, eventbrite_api_key_public, eventbrite_api_key_private, eventbrite_api_client_secret

## Pull Data from Meetup API
[Docs](https://www.meetup.com/meetup_api/docs/)

In [4]:
# https://www.meetup.com/meetup_api/docs/find/upcoming_events/
# Todo - look into topic categories for smarter filtering - https://www.meetup.com/meetup_api/docs/find/topic_categories/
payload = dict(
    text = 'Data',
    lat = '-33.87',
    lon = '151.21',
    radius = '100',
    page = '100',
    order = 'time',
    fields = 'featured_photo,key_photo,meta_category,group_category',
    key = meetup_api_key
)
attempt = requests.get("https://api.meetup.com/find/upcoming_events",params=payload)
data = attempt.json()
len(data['events'])

82

In [5]:
data['city']

{'id': 1000653,
 'city': 'Sydney',
 'lat': -33.87,
 'lon': 151.21,
 'state': '',
 'country': 'au',
 'zip': 'meetup1',
 'member_count': 21737}

In [6]:
for i, event in enumerate(data['events']):
    print("index: {} | name: {}".format(i, event['name']))

index: 0 | name: PPC Event - Head of Digital Marketing @ Finder, Founder Triggr AI, Ex Googler.
index: 1 | name: Build an Open Data API using Flask-RESTPlus
index: 2 | name: Internet of Things Solution rollouts | #DisruptorsInTech Sydney
index: 3 | name: What You Need to Know about Omni-channel Marketing
index: 4 | name: June: Detecting Online Credit Card Fraud with Apache Kafka Streams
index: 5 | name: Got an Idea? We’ll help you convert it into a startup
index: 6 | name: Time Series: A Conference Honouring Prof William Dunsmuir
index: 7 | name: Australian Skeptics in the Pub - METAFACT: FACT CHECKING BY VERIFIED EXPERTS
index: 8 | name: Digital Village Meetup (Networking)
index: 9 | name: SydCSS June
index: 10 | name: ANZ Workshop Tour - Sydney
index: 11 | name: Autonomous Database Cloud Diagnosability using Machine Learning
index: 12 | name: Know Yourself. Know life
index: 13 | name: #136 [All] A Cryptoguy, an Accountant, and a Buidler walks into a bar....
index: 14 | name: Defendin

In [7]:
data['events'][52]
# 48, 52

{'created': 1559445756000,
 'duration': 7200000,
 'id': '261968693',
 'name': '#20 Turnkey chatbots and privacy-preserving data synthesis!',
 'date_in_series_pattern': False,
 'status': 'upcoming',
 'time': 1561449600000,
 'local_date': '2019-06-25',
 'local_time': '18:00',
 'updated': 1559480853000,
 'utc_offset': 36000000,
 'waitlist_count': 0,
 'yes_rsvp_count': 28,
 'venue': {'id': 24458552,
  'name': 'Tyro FinTech Hub - Events Space',
  'lat': -33.86800765991211,
  'lon': 151.2050018310547,
  'repinned': True,
  'address_1': 'Level 5, 155 Clarence Street',
  'city': 'Sydney',
  'country': 'au',
  'localized_country_name': 'Australia'},
 'group': {'created': 1453958357000,
  'name': 'Sydney Natural Language Processing Meetup',
  'id': 19450519,
  'join_mode': 'open',
  'lat': -33.869998931884766,
  'lon': 151.2100067138672,
  'urlname': 'Sydney-Natural-Language-Processing-Meetup',
  'who': 'Humans',
  'localized_location': 'Sydney, Australia',
  'state': '',
  'country': 'au',
  'r

## Meetup Findings
- Not sure whether this list is comprehensive. It might make sense to assemble a list of data-related groups and then iterate through each one.
- Datetimes will need to be converted from epoch milliseconds to calendar dates and times before writing to the database.
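
The epoch conversion noted above could look something like the sketch below. It assumes that `time` and `utc_offset` are both in milliseconds, as the sample event suggests. The helper name `meetup_epoch_to_local` is hypothetical and not part of the Meetup API.

```python
from datetime import datetime, timedelta, timezone


def meetup_epoch_to_local(time_ms, utc_offset_ms):
    """Convert an epoch timestamp in milliseconds to a timezone-aware local datetime.

    Assumes both arguments are in milliseconds, as in the sample event above.
    """
    local_tz = timezone(timedelta(milliseconds=utc_offset_ms))
    return datetime.fromtimestamp(time_ms / 1000, tz=local_tz)


# Values copied from the sample event shown above
local_dt = meetup_epoch_to_local(1561449600000, 36000000)
event_date_local = local_dt.date().isoformat()
event_time_local = local_dt.strftime('%H:%M')
print(event_date_local, event_time_local)
```

For the sample event this reproduces the `local_date` and `local_time` values that Meetup already returns, so the conversion is mainly needed for fields such as `created` and `updated`.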

In [8]:
# DB Schema - Postgres 9.3+
sql = """
CREATE TABLE IF NOT EXISTS raw_events(
    event_id SERIAL PRIMARY KEY,
    source_platform_id CHAR VARYING(20),
    source_platform CHAR VARYING(50),
    event_city CHAR VARYING(50),
    event_date_local DATE,
    event_time_local TIME,
    event_name TEXT,
    event_organiser TEXT,
    event_location_name TEXT,
    event_location_address TEXT,
    event_lat NUMERIC(20,16),
    event_lon NUMERIC(20,16),
    registration_link TEXT,
    description TEXT,
    event_or_group_photo TEXT,
    event_format CHAR VARYING(100),
    event_category CHAR VARYING(100)
)
"""

In [9]:
# Response Schema based on Meetup
[
    {
        'event_id': '1',
        'source_platform_id': '261218488',
        'source_platform': 'Meetup',
        'event_city': 'Sydney',
        'event_date': '2019-06-14',
        'event_time': '07:30',
        'event_name': 'GA & Data Science Breakfast Meetup presents: The Rise of Automation',
        'event_organiser': 'Data Science Breakfast Meetup',
        'event_location_name': 'GA Sydney (Main Campus)',
        'event_location_address': 'The Podium Building, 1 Market St Sydney AU',
        'event_lat': '-33.869998931884766',
        'event_lon': '151.20460510253906',
        'registration_link': 'https://www.meetup.com/The-Sydney-Data-Science-Breakfast-Meetup-Group/events/261218488/',
        'description': '<p>Note: this is a partnered event with General Assembly. Please make sure to register using the General Assembly event page, linked here: <a href="https://ga.co/2PPoZWd" class="linkified">https://ga.co/2PPoZWd</a></p> <p>****<br/>Abstract:<br/>There has been a lot of attention in the media surrounding the rise of automation. As advanced technologies such as robotic process automation, machine learning, and artificial intelligence have matured, companies have found practical applications to these new technologies.</p> <p>As part of this evolution, Intelligent Automation has become a topic of interest for business leaders across industries looking to combine cognitive capabilities with robotic process technologies to create a "living" system that can go beyond mundane and repeatable tasks. This is extremely advantageous to any organization that can implement these systems seamlessly but there is still a delta between the ideation of AI integration. and the ability to put these plans into effect. This delta can instill fear and anxiety around using AI as well as the added question of the ethical implications of AI such as Facial Recognition and targeted advertising.</p> <p>Join General Assembly and the Data Science Breakfast Meetup as we present a panel of experts at the forefront of AI incorporation for an engaging conversation that will touch on.</p> <p>*****<br/>Agenda:<br/>07:30 - arrival and networking<br/>07:50 - panel kick off<br/>08:30 - panel Q&amp;A<br/>09:00 - more networking (and anyone who needs to head off can leave)<br/>09:30 - close</p> <p>*****<br/>Speaker BIO\'s:</p> <p>Anthony Tockar (Moderator):<br/>Anthony Tockar is director and cofounder at Verge Labs, a new type of AI company focused on the applied side of machine learning. A jack-of-all-trades, he has worked on problems across insurance, technology, telecommunications, loyalty, sports betting and even neuroscience. 
He qualified as an actuary, then moved into data science, completing an MS in Analytics at the prestigious Northwestern University.</p> <p>After hitting the headlines with his posts on data privacy at Neustar, he returned to Sydney to practice as a data scientist and cofounded the Minerva Collective, a not-for-profit focused on using data for social good, as well as multiple meetup groups. His key missions are to extend the reach and impact of data science to help people, and to assist Australian businesses to become more data driven.</p> <p>Sam Zheng (Panelist):<br/>Sam is Co-founder/CEO of Curious Thing - a voice-based AI interviewer for talent acquisition startup. Sam is a tech entrepreneur, self-taught engineer, and qualified actuary. Before Curious Thing, Sam was Co-founder/CTO of Hyper Anna, an AI for business analytics startup.</p> <p>Dima Galat (Panelist):<br/>Dima learned to program in Assembly on an i486, back when disk sizes were measured in megabytes. He always saw programming as a tool for facilitating communication between disparate data sources and end users.</p> <p>After his first encounter with data mining a decade ago, he became obsessed with applied machine learning, which supercharges this communication process. He has a background in computer vision productisation, data engineering, and a variety of analytics projects for clients ranging from financial institutions to United Nations.</p> <p>Usman Shahbaz (panelist):<br/>Usman is an experienced leader with more than 14 years of rich experience in applying product, network, risk-assurance and consumer analytics to drive actionable business outcomes. His core specialties include Advanced Analytics, Machine &amp; Deep Learning, Statistical Modelling and Optimisation. Usman is currently enrolled for a PhD in Machine Learning. He also holds an MBA and a Bachelor’s degree in Electrical Engineering.</p> <p>Passiona Cottee (panelist):<br/>Data Scientist, Commonwealth Bank, Co-lecturer at UTS</p> ',
        'event_or_group_photo': 'https://secure.meetupstatic.com/photos/event/c/6/6/5/highres_481070789.jpeg',
        'event_format': 'panel',
        'event_category': 'automation'       
    }
]

[{'event_id': '1',
  'source_platform_id': '261218488',
  'source_platform': 'Meetup',
  'event_city': 'Sydney',
  'event_date': '2019-06-14',
  'event_time': '07:30',
  'event_name': 'GA & Data Science Breakfast Meetup presents: The Rise of Automation',
  'event_organiser': 'Data Science Breakfast Meetup',
  'event_location_name': 'GA Sydney (Main Campus)',
  'event_location_address': 'The Podium Building, 1 Market St Sydney AU',
  'event_lat': '-33.869998931884766',
  'event_lon': '151.20460510253906',
  'registration_link': 'https://www.meetup.com/The-Sydney-Data-Science-Breakfast-Meetup-Group/events/261218488/',
  'description': '<p>Note: this is a partnered event with General Assembly. Please make sure to register using the General Assembly event page, linked here: <a href="https://ga.co/2PPoZWd" class="linkified">https://ga.co/2PPoZWd</a></p> <p>****<br/>Abstract:<br/>There has been a lot of attention in the media surrounding the rise of automation. As advanced technologies such 
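
A raw Meetup event could be mapped onto the response schema roughly as follows. This is a minimal sketch, not a settled design. The function name `meetup_event_to_row` is hypothetical. The `link` and `description` keys are assumed to be present in the API response, since they are not visible in the truncated output above. The address uses only `address_1` as a simplification.

```python
def meetup_event_to_row(event, platform='Meetup'):
    """Flatten one Meetup event dict into the response schema fields.

    Missing keys become None, so events without a venue still map cleanly.
    """
    venue = event.get('venue', {})
    group = event.get('group', {})
    return {
        'source_platform_id': event.get('id'),
        'source_platform': platform,
        'event_city': venue.get('city'),
        'event_date': event.get('local_date'),
        'event_time': event.get('local_time'),
        'event_name': event.get('name'),
        'event_organiser': group.get('name'),
        'event_location_name': venue.get('name'),
        'event_location_address': venue.get('address_1'),  # simplification
        'event_lat': venue.get('lat'),
        'event_lon': venue.get('lon'),
        'registration_link': event.get('link'),        # assumed key
        'description': event.get('description'),       # assumed key
    }


# Trimmed copy of the sample event shown earlier in the notebook
sample_event = {
    'id': '261968693',
    'name': '#20 Turnkey chatbots and privacy-preserving data synthesis!',
    'local_date': '2019-06-25',
    'local_time': '18:00',
    'venue': {'name': 'Tyro FinTech Hub - Events Space',
              'lat': -33.86800765991211,
              'lon': 151.2050018310547,
              'address_1': 'Level 5, 155 Clarence Street',
              'city': 'Sydney'},
    'group': {'name': 'Sydney Natural Language Processing Meetup'},
}
row = meetup_event_to_row(sample_event)
print(row['event_name'], '|', row['event_organiser'])
```

`event_format` and `event_category` are left out because nothing in the Meetup payload shown here maps to them directly.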

## Pull Data from Eventbrite
Involves OAuth2 to get an access token. Sorting out the token flow is a work in progress. <br>
*References:*
- [Requests Backend application flow](https://requests-oauthlib.readthedocs.io/en/latest/oauth2_workflow.html#backend-application-flow)
- [Eventbrite API Docs](https://www.eventbrite.com/platform/api)
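
While the full OAuth2 flow is unresolved, one possible shortcut is to send a private token as a bearer header. The sketch below only builds the request arguments and does not call the network. It assumes that `eventbrite_api_key_private` is a personal OAuth token accepted by the v3 API and that the `users/me/` endpoint is available. The helper name `eventbrite_request_args` is hypothetical.

```python
EVENTBRITE_API_BASE = 'https://www.eventbriteapi.com/v3/'


def eventbrite_request_args(token, path):
    """Build the URL and headers for an authenticated Eventbrite API v3 call.

    Assumes the token is a personal OAuth token sent as a bearer header.
    """
    return {
        'url': EVENTBRITE_API_BASE + path.lstrip('/'),
        'headers': {'Authorization': 'Bearer {}'.format(token)},
    }


# Placeholder token for illustration; in the notebook this would be
# eventbrite_api_key_private from creds.py
args = eventbrite_request_args('dummy-token', 'users/me/')
print(args['url'])
# With requests imported, the call itself would be: requests.get(**args)
```

If this works, the `oauth2` package may not be needed for read-only access.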

In [10]:
# https://pypi.org/project/python-oauth2/
import oauth2

In [11]:
# See if a flow similar to Twitter's works
# NOTE: CONSUMER_KEY and CONSUMER_SECRET are not defined anywhere in this notebook
def oauth_req(url, key, secret, http_method="GET", post_body="", http_headers=None):
    consumer = oauth2.Consumer(key=CONSUMER_KEY, secret=CONSUMER_SECRET)
    token = oauth2.Token(key=key, secret=secret)
    client = oauth2.Client(consumer, token)
    resp, content = client.request( url, method=http_method, body=post_body, headers=http_headers )
    return content

In [12]:
url_ = "https://www.eventbrite.com/oauth/authorize"
test = oauth_req(url_,eventbrite_api_key_public,eventbrite_api_client_secret)
# Fail

AttributeError: module 'oauth2' has no attribute 'Consumer'