# Scraping the PyOhio Schedule
The twelfth annual [PyOhio conference](https://www.pyohio.org/2019/) was held on July 27-28, 2019, and...it. was. awesome!

Now, when it comes to planning for a conference, I must admit that I'm a bit "old school." A day or two before the gathering, I like to print out the schedule and carefully research each session so that I can choose the ones that best meet my work and personal objectives. Often, a conference will let you download a printable schedule; however, I didn't find any such file on the [PyOhio website](https://www.pyohio.org/2019/). No matter: I can write some Python to scrape the schedule from the website and create my own CSV for printing. Here's what I did:

### Import [requests](https://2.python-requests.org//en/v0.10.6/), [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/), and [csv](https://docs.python.org/3/library/csv.html)

In [1]:
import requests
from bs4 import BeautifulSoup
import csv

### Grab the page with requests

In [2]:
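result = requests.get('https://www.pyohio.org/2019/events/schedule/')
result.raise_for_status()  # stop early on a 4xx/5xx response instead of parsing an error page
soup = BeautifulSoup(result.content, 'lxml')  # the 'lxml' parser requires the lxml package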

### Parse the page with BeautifulSoup
Unfortunately, I can only go to the conference on Saturday, so I'll just grab the Saturday sessions.

In [3]:
day_2_list = [['start_end', 'slot1', 'slot2', 'slot3', 'slot4']]
day_2 = soup.select('div.day')[1]  # get just Saturday
talks_section = day_2.find('h3', string='Keynotes, Talks, & Tutorials').parent

# iterate across each time block
for time_block in talks_section.select('div.time-block'):
    start_end = time_block.find('div', {'class': 'time-wrapper'}).get_text().replace('to', ' - ')
    time_rec = [start_end, '', '', '', '']
    # now, iterate across each slot within a time block.  a time block can have 1-4 time slots
    for slot in time_block.select('div.time-block-slots'):
        for i, card in enumerate(slot.select('div.schedule-item')):
            class_title = card.select_one('h3').get_text()
            presenter = (card.select('p')[0]).get_text()
            location = (card.select('p')[1]).get_text()
            time_rec[i+1] = '{0}\n{1}\n{2}'.format(class_title, presenter, location)
    day_2_list.append(time_rec)  # after grabbing each slot, write the block to my "day 2" list
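
To make the loop above easier to test without hitting the live site, here is a minimal sketch that runs the same kind of parsing against a tiny hand-written HTML fragment. The markup in `SAMPLE_HTML` is hypothetical: I shaped it only to match the selectors used above, and the real page may differ. The helper name `parse_time_block` and its `max_slots` parameter are my own inventions for illustration, and I use the built-in `html.parser` here so the sketch does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a stand-in for one time block, written only to
# match the selectors used in the scraping loop. Not copied from the site.
SAMPLE_HTML = """
<div class="time-block">
  <div class="time-wrapper">9:00 to 9:45</div>
  <div class="time-block-slots">
    <div class="schedule-item">
      <h3>Talk A</h3><p>Speaker A</p><p>Room 1</p>
    </div>
    <div class="schedule-item">
      <h3>Talk B</h3><p>Speaker B</p><p>Room 2</p>
    </div>
  </div>
</div>
"""


def parse_time_block(time_block, max_slots=4):
    """Return [start_end, slot1, ..., slotN] for one time block (illustrative helper)."""
    start_end = time_block.select_one('div.time-wrapper').get_text(strip=True)
    start_end = start_end.replace(' to ', ' - ')
    row = [start_end] + [''] * max_slots
    # Flatten every card in the block; ignore any beyond max_slots.
    cards = time_block.select('div.time-block-slots div.schedule-item')
    for i, card in enumerate(cards[:max_slots]):
        title = card.select_one('h3').get_text(strip=True)
        paragraphs = [p.get_text(strip=True) for p in card.select('p')]
        presenter = paragraphs[0] if len(paragraphs) > 0 else ''
        location = paragraphs[1] if len(paragraphs) > 1 else ''
        row[i + 1] = '\n'.join([title, presenter, location])
    return row


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
rows = [parse_time_block(tb) for tb in soup.select('div.time-block')]
```

Unlike my loop above, this sketch tolerates a card with fewer than two `<p>` tags by leaving the missing field empty, which seemed like a reasonable guard in case a keynote or break is marked up differently.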

### Finally, write my list to a CSV file

csv_file = 'pyohio_20190727_schedule.csv'

with open(csv_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(day_2_list)
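
Each cell I write contains embedded newlines (title, presenter, location), so it is worth checking that they survive the trip through the `csv` module. The sketch below uses an in-memory buffer and a made-up sample row rather than the real schedule; it shows that `csv.writer` quotes such cells and that `csv.reader` reads them back as a single field.

```python
import csv
import io

# Made-up sample data in the same shape as day_2_list.
rows = [
    ['start_end', 'slot1'],
    ['9:00 - 9:45', 'Talk A\nSpeaker A\nRoom 1'],
]

# newline='' mirrors the open(..., newline='') call used for the real file.
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()

# Read the text back and confirm the multi-line cell is still one field.
round_trip = list(csv.reader(io.StringIO(text, newline='')))
```

This is also why the cells show up as multi-line in a spreadsheet: the line breaks sit inside a quoted field instead of starting a new record.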

### For convenience, go ahead and open my CSV
I have Excel installed on my workstation, so running the file name as a shell command opens the CSV there (this relies on the Windows file association for `.csv`), and I can format it nicely for printing.

In [6]:
!{csv_file}
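
The shell shortcut above depends on how the operating system handles a bare file name. As a rough, more portable sketch, the helper below picks a launcher by platform. The function names `open_command` and `open_with_default_app` are hypothetical, and the Linux branch assumes a desktop with `xdg-open` available; I have only described the Windows route in this post.

```python
import os
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argv list that opens `path`, or None when on Windows."""
    if platform.startswith('win'):
        return None  # Windows goes through os.startfile, not a command
    if platform == 'darwin':
        return ['open', path]  # macOS launcher
    return ['xdg-open', path]  # assumption: Linux desktop with xdg-utils


def open_with_default_app(path):
    """Open `path` in whatever application the OS associates with it."""
    command = open_command(path)
    if command is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.run(command, check=False)
```

Calling `open_with_default_app(csv_file)` would then stand in for the `!{csv_file}` cell.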