# PlayHQ Fixture Scraping

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ssardina/tapp-fixture/blob/main/playhq_scrape.ipynb)

This system scrapes game fixtures from [PlayHQ](http://playhq.com/) via its public [API](https://support.playhq.com/hc/en-au/sections/4405422358297-PlayHQ-APIs). It produces a CSV file ready to be uploaded as a Schedule in [TeamApp](https://brunswickmagicbasketball.teamapp.com/).

The *Public* APIs only require header parameters to return a successful response. The header includes the following components:

- `x-api-key` (also referred to as the Client ID) is provided by PlayHQ when you request access to the public API via their [support page](https://support.playhq.com/hc/en-au) or by emailing support@playhqsupport.zendesk.com. The key can be stored in a file `x_api_key.txt`; otherwise the notebook will ask for it interactively. In many cases, creating new API credentials is disabled for regular users and can only be done by a Super Administrator within the PlayHQ portal.
- `x-phq-tenant` usually refers to the sport/association; in this case it is `bv`.


Detailed reference documentation for PlayHQ API can be found [here](https://docs.playhq.com/tech).
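As a rough illustration of the two header parameters described above, the sketch below builds the header dictionary and attaches it to a request object without sending anything. The helper names `load_api_key` and `build_headers`, the placeholder key, and the `example.invalid` URL are assumptions made for this example; they are not part of `playhq.py` or of the PlayHQ API.

```python
import urllib.request
from pathlib import Path


def load_api_key(path="x_api_key.txt"):
    """Return the key stored in `path`, or None if the file does not exist."""
    key_file = Path(path)
    if key_file.exists():
        return key_file.read_text().strip()
    return None


def build_headers(api_key, tenant="bv"):
    """Build the two header parameters the public API expects."""
    return {"x-api-key": api_key, "x-phq-tenant": tenant}


# Placeholder key and URL (hypothetical): replace with real values before use.
headers = build_headers("YOUR-API-KEY")
request = urllib.request.Request("https://example.invalid/endpoint", headers=headers)
print(headers)
```

Keeping the key in a separate file (or in the config file) rather than in the notebook itself avoids committing it by accident.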

**Contact:** Sebastian Sardina (sssardina@gmail.com)

In [None]:
# from IPython.core.interactiveshell import InteractiveShell
# InteractiveShell.ast_node_interactivity = "all"
import pandas as pd
import re
import os
import calendar, datetime
import configparser

# Set-up everything if running in Google Colab
if "COLAB_GPU" in os.environ:
  %pip install pyshorteners
  %pip install coloredlogs
  for f in ['utils.py', 'playhq.py']:
    if not os.path.exists(f):
      !wget "https://raw.githubusercontent.com/ssardina/tapp-fixture/main/{f}"

import utils
import playhq as phq

## 1. Configuration and set-up

We first configure and set up the application. This means reading the configuration variables from a config file and setting the game day.

First, we specify:

1. Configuration file for the club and season.
2. Game date to scrape.
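For orientation, here is a minimal sketch of both inputs. The section and key names match the ones this notebook reads (`main` and `playhq`), but every value is a placeholder. The helper `next_weekday` is hypothetical: it only illustrates how a "next Saturday" date can be computed and is not necessarily how `utils.next_day` behaves (for example, whether today counts when it is already a Saturday).

```python
import calendar
import configparser
import datetime

# Placeholder configuration: only the section and key names come from the notebook.
SAMPLE_CONFIG = """
[main]
CLUB_NAME = Example Basketball Club
TIMEZONE = Australia/Melbourne
SEASON = Example Season
OUTPUT_PATH = fixture/

[playhq]
ORG_ID = 00000000-0000-0000-0000-000000000000
X_TENANT = bv
X_API_KEY = YOUR-API-KEY
PLAYHQ_SEASON_URL = https://example.invalid/season
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)


def next_weekday(weekday, today=None):
    """Return the first date on or after `today` that falls on `weekday` (0 = Monday)."""
    today = today or datetime.date.today()
    return today + datetime.timedelta(days=(weekday - today.weekday()) % 7)


print(config.get("main", "CLUB_NAME"))
print(next_weekday(calendar.SATURDAY, datetime.date(2022, 11, 9)))  # 2022-11-12
```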

In [None]:
CONFIG_FILE = 'config_bmc.cfg'

# set the game date
GAME_DATE = utils.next_day(calendar.SATURDAY)
# GAME_DATE = datetime.date(2022, 8, 27)

In [28]:
config = configparser.ConfigParser()
config.read(CONFIG_FILE)

CLUB_NAME = config.get('main','CLUB_NAME')
TIMEZONE = config.get('main','TIMEZONE')
SEASON = config.get('main','SEASON')
OUTPUT_PATH = config.get('main','OUTPUT_PATH')

ORG_ID = config.get('playhq','ORG_ID')
X_TENANT = config.get('playhq','X_TENANT')
X_API_KEY = config.get('playhq','X_API_KEY')
PLAYHQ_SEASON_URL = config.get('playhq', 'PLAYHQ_SEASON_URL')

# Get nice game date format: Saturday August 06, 2022
GAME_DATE_TIMESTAMP = pd.to_datetime(GAME_DATE).tz_localize(TIMEZONE)
GAME_DATE_NAME = GAME_DATE_TIMESTAMP.strftime("%A %B %d, %Y (%Y/%m/%d)")

# Create phq_club object
phq_club = phq.PlayHQ(CLUB_NAME, ORG_ID, X_API_KEY, X_TENANT, TIMEZONE)
PLAYHQ_GAMES_URL = f"https://bv.playhq.com/org/{ORG_ID}/games?date={GAME_DATE_TIMESTAMP.strftime('%Y-%m-%d')}"

print("Club PlayHQ link: ", PLAYHQ_SEASON_URL)


print(f"Sections in file {CONFIG_FILE}:", config.sections())
print("Club name:", CLUB_NAME)
print("Timezone:", TIMEZONE)
print("Organization id:", ORG_ID)
print("X-tenant:", X_TENANT)
print("X-API-KEY:", X_API_KEY)
print("Output path::", OUTPUT_PATH)
print("Game date:", GAME_DATE_NAME)
print("PlayHQ Club fixture:", PLAYHQ_SEASON_URL)
print("PlayHQ Admin games:", PLAYHQ_GAMES_URL)

Club PlayHQ link:  "https://bit.ly/bmbc-s22"
Sections in file config_bmc.cfg: ['main', 'playhq']
Club name: Brunswick Magic Basketball Club
Timezone: Australia/Melbourne
Organization id: 8c4d5431-eaa5-4644-82ac-992abe224b88
X-tenant: bv
X-API-KEY: f5d33c76-f858-49fa-8330-8e0e396219cd
Output path:: fixture/
Game date: Saturday November 12, 2022 (2022/11/12)
PlayHQ Club fixture: "https://bit.ly/bmbc-s22"
PlayHQ Admin games: https://bv.playhq.com/org/8c4d5431-eaa5-4644-82ac-992abe224b88/games?date=2022-11-12


Next, we configure some variables that will be used later when generating the schedules.
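The description template uses `{placeholder}` fields that are filled in per game, while the season URL is the same for every game and is substituted once up front with `str.replace`. The shortened template below sketches that two-step idea; it assumes the per-game fields are filled with `str.format`, which is a guess about what `utils.py` does, and the URL is a placeholder.

```python
# Shortened, hypothetical template: the real one has more fields.
TEMPLATE = """Opponent: {opponent}
Venue: {venue} ({court})
All clubs in PlayHQ: SEASON_URL
""".replace("SEASON_URL", "https://example.invalid/season")

# Per-game values, taken from the sample output in this notebook.
game = {"opponent": "Rebels U16 Boys Blue", "venue": "Glenroy College", "court": "Court 1"}

description = TEMPLATE.format(**game)
print(description)
```

Substituting the constant part first leaves the braces untouched, so the same template string can be reused for every game.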

In [None]:
DESC_BYE_TAPP = "Sorry, no game for the team in this round."
DESC_TAPP = """RSVP mandatory for the game.

Opponent: {opponent}
Venue: {venue} ({court})
Address: {address} {address_tips}
Google Maps coord: https://maps.google.com/?q={coord}

- Please ensure you arrive early and ready.
- Remember that shorts should have no pockets, players should not wear bracelets/watch as it is a risk of injury.
- No food in the venue and pickup your rubbish.
- Games will have 2x20 min halves.
- Each team needs to provide a scorer. TMs, please consider a roster.
- Players should not bring balls into the venue - game balls provided by Magic in coach's equipment bag.
- Beginners refs will be wearing green shirts. Please support and respect them through a POSITIVE sideline behaviour.

Check the game in PlayHQ: {url_game}
Check the round in PlayHQ: {url_grade}
All clubs in PlayHQ: PLAYHQ_SEASON_URL
""".replace("PLAYHQ_SEASON_URL", PLAYHQ_SEASON_URL)

## 2. Get upcoming games for club's teams

First, get the teams of the club.

In [10]:
season_id = phq_club.get_season_id(SEASON)
teams_df = phq_club.get_season_teams(season_id)
teams_df

Unnamed: 0,id,name,grade.id,grade.name,grade.url,age
0,fd7754ce-95db-4e6b-9230-e4e4f4b30456,Magic U8 Mixed Purple,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
1,ae5e5655-d374-4bc2-883b-80bb89da69f2,Magic U8 Mixed Gold,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
2,0e21ad4e-27fc-445d-844b-2a477d55f665,Magic U8 Mixed Black,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
3,03efc707-9627-45d8-978b-085df5201f89,Magic U18 Girls Gold,75efe372-bbd0-4bf9-ac6d-629ae546a14d,Saturday U18 Girls Division 1,https://www.playhq.com/basketball-victoria/org...,18
4,f6460436-a741-4b78-96d9-1b3cfe984fb9,Magic U18 Boys Purple,6d022389-498f-459c-a08c-b958cfa1eb2c,Saturday U18 Boys Division 4,https://www.playhq.com/basketball-victoria/org...,18
5,fd2ca9b6-8345-420d-9496-31620a0a1caf,Magic U16 Girls Purple,f336e651-9ad6-46bb-9ec8-7b2b13baf568,Saturday U16 Girls Division 4,https://www.playhq.com/basketball-victoria/org...,16
6,1a66c6a8-89ba-4838-b076-7724a0488675,Magic U16 Girls Gold,f336e651-9ad6-46bb-9ec8-7b2b13baf568,Saturday U16 Girls Division 4,https://www.playhq.com/basketball-victoria/org...,16
7,529ce712-83f0-439e-9d12-a63a6769037f,Magic U16 Boys Purple,cddbe9fa-645e-4703-affa-0837d9d1bc56,Saturday U16 Boys Division 4/5,https://www.playhq.com/basketball-victoria/org...,16
8,9ce81394-0bce-4abc-bf1e-2c59ef687f0a,Magic U16 Boys Gold,11aa6947-5d5f-4c12-8eb8-8ba3ac88ebe6,Saturday U16 Boys Division 1/2,https://www.playhq.com/basketball-victoria/org...,16
9,d1c4722e-7e84-4577-85a0-98707dab65d2,Magic U16 Boys Diamond,5f405813-6588-4876-ae23-fdf5bd0a7f12,Saturday U16 Boys Division 3Res,https://www.playhq.com/basketball-victoria/org...,16


Next, extract all upcoming games for the club's teams.

In [11]:
upcoming_games_df = phq_club.get_games(teams_df, GAME_DATE_TIMESTAMP)

if upcoming_games_df is not None:
    print(f'There were {upcoming_games_df.shape[0]} games extracted for game day: {GAME_DATE_NAME}')
    upcoming_games_df[phq.GAMES_COLS]
else:
    print("No games for date: ", GAME_DATE_NAME)

2022-11-09 19:10:21 INFO Games extracted for team: Magic U8 Mixed Purple
2022-11-09 19:10:21 INFO Games extracted for team: Magic U8 Mixed Gold
2022-11-09 19:10:22 INFO Games extracted for team: Magic U8 Mixed Black
2022-11-09 19:10:22 INFO Games extracted for team: Magic U18 Girls Gold
2022-11-09 19:10:24 INFO Games extracted for team: Magic U18 Boys Purple
2022-11-09 19:10:24 INFO Games extracted for team: Magic U16 Girls Purple
2022-11-09 19:10:24 INFO Games extracted for team: Magic U16 Girls Gold
2022-11-09 19:10:25 INFO Games extracted for team: Magic U16 Boys Purple
2022-11-09 19:10:25 INFO Games extracted for team: Magic U16 Boys Gold
2022-11-09 19:10:25 INFO Games extracted for team: Magic U16 Boys Diamond
2022-11-09 19:10:26 INFO Games extracted for team: Magic U16 Boys Black
2022-11-09 19:10:26 INFO Games extracted for team: Magic U14 Girls Black
2022-11-09 19:10:27 INFO Games extracted for team: Magic U14 Girls Purple
2022-11-09 19:10:27 INFO Games extracted for team: Magic

There were 27 games extracted for game day: Saturday November 12, 2022 (2022/11/12)


## 3. Convert to TeamApp CSV format

Next, we convert the upcoming PlayHQ games to TeamApp format so that we can produce a CSV file to import into TeamApp.
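Part of this conversion is deriving the end of each event from its start and a fixed game duration (45 minutes in the call below). A minimal sketch of that step, using only the standard library, could look as follows; `game_slot` is a hypothetical helper, and the actual logic in `utils.to_teamsapp_schedule` may differ.

```python
import datetime


def game_slot(start, duration_minutes=45):
    """Return the TeamApp-style date/time columns for a game starting at `start`."""
    end = start + datetime.timedelta(minutes=duration_minutes)
    return {
        "start_date": start.date().isoformat(),
        "end_date": end.date().isoformat(),
        "start_time": start.strftime("%H:%M:%S"),
        "end_time": end.strftime("%H:%M:%S"),
    }


slot = game_slot(datetime.datetime(2022, 11, 12, 13, 45))
print(slot)
```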

In [12]:
games_tapps_df = utils.to_teamsapp_schedule(upcoming_games_df, desc_template=DESC_TAPP, game_duration=45)
print("Done computing the games for Teams App")

games_tapps_df.sample(3)

Done computing the games for Teams App


Unnamed: 0,event_name,team_name,start_date,end_date,start_time,end_time,description,venue,location,access_groups,rsvp,comments,attendance_tracking,duty_roster,ticketing,opponent,court
10,U16 Boys Black - Round 5,U16 Boys Black,2022-11-12,2022-11-12,14:30:00,15:15:00,RSVP mandatory for the game.\n\nOpponent: Jets...,Northcote High School,"19-25 St Georges Road, Northcote",U16 Boys Black,1,1,0,1,0,Jets U16 Boys Red,Court 1
1,U8 Mixed Gold - Round 5,U8 Mixed Gold,2022-11-12,2022-11-12,08:30:00,09:15:00,RSVP mandatory for the game.\n\nOpponent: Magi...,Coburg North Primary School,"180 OHEA ST, COBURG",U8 Mixed Gold,1,1,0,1,0,Magic U8 Mixed Purple,Court 1
9,U16 Boys Diamond - Round 5,U16 Boys Diamond,2022-11-12,2022-11-12,13:45:00,14:30:00,RSVP mandatory for the game.\n\nOpponent: Rebe...,Glenroy College,"BOX FOREST COLLEGE, 120 GLENROY RD, GLENROY",U16 Boys Diamond,1,1,0,1,0,Rebels U16 Boys Blue,Court 1


Inspect what the description of one of the games will look like:

In [23]:
# Inspect description game of one team
team = "U16 Boys Diamond"

print("Description for:", team)
print(games_tapps_df.query("team_name == @team")['description'].values[0])

Description for: U16 Boys Diamond
RSVP mandatory for the game.

Opponent: Rebels U16 Boys Blue
Venue: Glenroy College (Court 1)
Address: BOX FOREST COLLEGE, 120 GLENROY RD, GLENROY 
Google Maps coord: https://maps.google.com/?q=(-37.7045931,144.9258952)

- Please ensure you arrive early and ready.
- Remember that shorts should have no pockets, players should not wear bracelets/watch as it is a risk of injury.
- No food in the venue and pickup your rubbish.
- Games will have 2x20 min halves.
- Each team needs to provide a scorer. TMs, please consider a roster.
- Players should not bring balls into the venue - game balls provided by Magic in coach's equipment bag.
- Beginners refs will be wearing green shirts. Please support and respect them through a POSITIVE sideline behaviour.

Check the game in PlayHQ: https://tinyurl.com/2dzqzn4w
Check the round in PlayHQ: https://tinyurl.com/247qb967
All clubs in PlayHQ: "https://bit.ly/bmbc-s22"



### Extract BYE games

We now extract the teams for which we couldn't scrape a game. In most cases this means a BYE for those teams.
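The idea is a set difference: every club team whose id does not appear among the scraped games is treated as having a BYE. The sketch below shows this with plain dictionaries and made-up ids; `short_name` is a hypothetical helper that strips the club prefix and, unlike a bare `re.search(...).group(0)`, falls back to the full name when the pattern does not match.

```python
import re


def short_name(team_name):
    """Drop the club prefix, e.g. 'Magic U8 Mixed Gold' -> 'U8 Mixed Gold'."""
    match = re.search(r"U\d+.*", team_name)
    return match.group(0) if match else team_name


# Made-up ids for illustration; the team names come from the table above.
all_teams = {
    "id-1": "Magic U8 Mixed Purple",
    "id-2": "Magic U8 Mixed Gold",
    "id-3": "Magic U18 Girls Gold",
}
playing_ids = {"id-1", "id-3"}

bye_teams = [short_name(name) for team_id, name in all_teams.items() if team_id not in playing_ids]
print(bye_teams)  # ['U8 Mixed Gold']
```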

In [15]:
# Extract the date of the round
# date = team_apps_csv_df.iloc[1]['start_date']
print(f"Extract BYE games for games on {GAME_DATE_NAME}")

playing_teams = upcoming_games_df['team_id'].tolist()
bye_teams = teams_df.loc[~teams_df['id'].isin(playing_teams)]['name'].tolist()
bye_teams = list(map(lambda x: re.search("U.*", x).group(0), bye_teams))

if bye_teams:
    games_bye_df = utils.build_teamsapp_bye_schedule(bye_teams, GAME_DATE)
    print(f"Bye teams ({len(bye_teams)}): ", bye_teams)
else:
    print("No BYE games this round...")

Extract BYE games for games on Saturday November 12, 2022 (2022/11/12)
No BYE games this round...


Finally, put together upcoming games and BYE games in a single table that will later be used to produce a CSV for TeamApp schedule import.

In [24]:
if bye_teams:
    team_apps_csv_df = pd.concat([games_tapps_df, games_bye_df])
    team_apps_csv_df.drop_duplicates(inplace=True)
    team_apps_csv_df.reset_index(inplace=True, drop=True)
else:
    team_apps_csv_df = games_tapps_df

team_apps_csv_df.sample(4)

Unnamed: 0,event_name,team_name,start_date,end_date,start_time,end_time,description,venue,location,access_groups,rsvp,comments,attendance_tracking,duty_roster,ticketing,opponent,court
4,U18 Boys Purple - Round 5,U18 Boys Purple,2022-11-12,2022-11-12,13:45:00,14:30:00,RSVP mandatory for the game.\n\nOpponent: Pant...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U18 Boys Purple,1,1,0,1,0,Panthers U18 Boys Lime,Court 4
8,U16 Boys Gold - Round 5,U16 Boys Gold,2022-11-12,2022-11-12,13:00:00,13:45:00,RSVP mandatory for the game.\n\nOpponent: STAR...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U16 Boys Gold,1,1,0,1,0,STARS U16 Boys RW,Court 1
18,U12 Girls Gold - Round 5,U12 Girls Gold,2022-11-12,2022-11-12,10:45:00,11:30:00,RSVP mandatory for the game.\n\nOpponent: Jets...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U12 Girls Gold,1,1,0,1,0,Jets U12 Girls Blue,Court 4
9,U16 Boys Diamond - Round 5,U16 Boys Diamond,2022-11-12,2022-11-12,13:45:00,14:30:00,RSVP mandatory for the game.\n\nOpponent: Rebe...,Glenroy College,"BOX FOREST COLLEGE, 120 GLENROY RD, GLENROY",U16 Boys Diamond,1,1,0,1,0,Rebels U16 Boys Blue,Court 1


## 5. Save to CSV file for TeamApp import

In this section, we produce the CSV file to be imported into TeamApp, as well as pickle files that save the computed dataframes.

We start by reporting the games to be written to the schedule CSV file and **CHECKING THAT ALL IS GOOD TO GO!**

In particular, look for games that are scheduled but **PENDING** and missing some details (time or venue).

In [17]:
team_apps_csv_df.columns
team_apps_csv_df[['team_name', 'opponent', 'start_date', 'start_time', 'venue', 'court']]
# team_apps_csv_df

Unnamed: 0,team_name,opponent,start_date,start_time,venue,court
0,U8 Mixed Purple,Magic U8 Mixed Gold,2022-11-12,08:30:00,Coburg North Primary School,Court 1
1,U8 Mixed Gold,Magic U8 Mixed Purple,2022-11-12,08:30:00,Coburg North Primary School,Court 1
2,U8 Mixed Black,Panthers U8 Mixed,2022-11-12,08:30:00,Coburg Basketball Stadium,Court 3
3,U18 Girls Gold,St Fidelis U18 Girls Blue,2022-11-12,15:15:00,Coburg Basketball Stadium,Court 3
4,U18 Boys Purple,Panthers U18 Boys Lime,2022-11-12,13:45:00,Coburg Basketball Stadium,Court 4
5,U16 Girls Purple,Panthers U16 Girls Green,2022-11-12,13:45:00,Northcote High School,Court 1
6,U16 Girls Gold,Newlands U16 Girls WILDCATS,2022-11-12,13:45:00,Dallas Brooks Community Primary School,Court 1
7,U16 Boys Purple,Newlands U16 Boys TIGERS,2022-11-12,14:30:00,Coburg Senior High School,Court 1
8,U16 Boys Gold,STARS U16 Boys RW,2022-11-12,13:00:00,Coburg Basketball Stadium,Court 1
9,U16 Boys Diamond,Rebels U16 Boys Blue,2022-11-12,13:45:00,Glenroy College,Court 1


We stop execution here in case the whole Jupyter notebook is being run at once.

In [None]:
raise SystemExit("Stop right there! Continue below to produce the CSV file if needed.")

### 5.2. Check changes with previous saves

If the schedule was generated before, check whether the new one differs from the one already saved.
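Conceptually, the comparison classifies each team as new, dropped, or updated. The sketch below does this with plain dictionaries, assuming one game per team per game day; `schedule_changes` is a hypothetical helper, not a function from `utils.py`. Note that an inner merge on `team_name` only lists teams present in both schedules, whereas this version also reports teams that appear in just one of them.

```python
def schedule_changes(old_rows, new_rows, key="team_name"):
    """Classify teams as new, dropped, or updated between two schedules."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    return {
        "new": sorted(new.keys() - old.keys()),
        "dropped": sorted(old.keys() - new.keys()),
        "updated": sorted(t for t in new.keys() & old.keys() if new[t] != old[t]),
    }


old_schedule = [
    {"team_name": "U16 Boys Gold", "start_time": "13:00:00", "court": "Court 1"},
    {"team_name": "U8 Mixed Black", "start_time": "08:30:00", "court": "Court 3"},
]
new_schedule = [
    {"team_name": "U16 Boys Gold", "start_time": "13:45:00", "court": "Court 1"},
    {"team_name": "U18 Girls Gold", "start_time": "15:15:00", "court": "Court 3"},
]

changes = schedule_changes(old_schedule, new_schedule)
print(changes)
```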

First, let us define the files that we will save to disk.

In [25]:
import datetime
import os
import shutil

now = datetime.datetime.now() # current date and time
now_str = now.strftime("%Y-%m-%d_%H:%M:%S")
game_date_str = GAME_DATE.strftime('%Y_%m_%d')

file_csv = os.path.join(OUTPUT_PATH, f"schedule-teamsapp-{game_date_str}.csv")
file_upcoming_pkl = os.path.join(OUTPUT_PATH, f"upcoming_games_df-{GAME_DATE.strftime('%Y_%m_%d')}.pkl")
file_team_apps_csv = os.path.join(OUTPUT_PATH, f"team_apps_csv_df-{GAME_DATE.strftime('%Y_%m_%d')}.pkl")

Next, let's check if there was a saved file for the upcoming game day.

In [26]:
cols = ['team_name', 'opponent', 'start_date', 'start_time', 'venue', 'court']

changed_games_df = None
if os.path.exists(file_team_apps_csv):
    print("There was already a schedule saved, recovering it to compare...")
    old_team_apps_csv_df = pd.read_pickle(file_team_apps_csv)

    teams_changed = pd.concat([team_apps_csv_df[cols], old_team_apps_csv_df[cols]]).drop_duplicates(keep=False)['team_name'].unique()
    print("Teams whose games have changed (updated, new, dropped):", teams_changed)

    old_games_df = old_team_apps_csv_df[cols].query("team_name in @teams_changed")
    new_games_df = team_apps_csv_df[cols].query("team_name in @teams_changed")
    changed_games_df = new_games_df.merge(old_games_df, how="inner", on="team_name", suffixes=('_new', '_old'))

# Show changes if any...
(changed_games_df is not None) and changed_games_df

False

### 5.3. Write a TeamApp Schedule CSV & Dataframe Pickles

Finally, we save the data to a CSV file that can be imported into the [SCHEDULE of TeamApp for all Entries](https://brunswickmagicbasketball.teamapp.com/clubs/263995/events?_list=v1&team_id=all).
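The save step follows a simple pattern: create the output folder if needed, copy any existing file to a `.bak` backup, then overwrite it. A self-contained sketch of that pattern with the standard `csv` module is shown here; `write_csv_with_backup` is a hypothetical helper, and the example writes into a temporary directory rather than the real output path.

```python
import csv
import os
import shutil
import tempfile


def write_csv_with_backup(path, rows, fieldnames):
    """Write `rows` to `path`, keeping the previous version as `path + '.bak'`."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    if os.path.exists(path):
        shutil.copy(path, path + ".bak")
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


out_dir = tempfile.mkdtemp()
target = os.path.join(out_dir, "fixture", "schedule-teamsapp-2022_11_12.csv")
rows = [{"team_name": "U16 Boys Diamond", "start_time": "13:45:00"}]

write_csv_with_backup(target, rows, ["team_name", "start_time"])  # first save
write_csv_with_backup(target, rows, ["team_name", "start_time"])  # second save creates the backup
print(sorted(os.listdir(os.path.dirname(target))))
```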

In [27]:
import datetime
import os
import shutil

now = datetime.datetime.now() # current date and time
now_str = now.strftime("%Y-%m-%d_%H:%M:%S")
game_date_str = GAME_DATE.strftime('%Y_%m_%d')

if not os.path.exists(OUTPUT_PATH):
  os.makedirs(OUTPUT_PATH)

print('Saving TeamAPP schedule CSV file and Dataframes for games:', GAME_DATE_NAME)

file_csv = os.path.join(OUTPUT_PATH, f"schedule-teamsapp-{game_date_str}.csv")
file_upcoming_pkl = os.path.join(OUTPUT_PATH, f"upcoming_games_df-{GAME_DATE.strftime('%Y_%m_%d')}.pkl")
file_team_apps_csv = os.path.join(OUTPUT_PATH, f"team_apps_csv_df-{GAME_DATE.strftime('%Y_%m_%d')}.pkl")

for f in [file_csv, file_upcoming_pkl, file_team_apps_csv]:
  if os.path.exists(f):
    print("Backup file", f)
    shutil.copy(f, f + ".bak")

print('File to save TeamApp schedule:', file_csv)
team_apps_csv_df.to_csv(file_csv, index=False)

print('Saving dataframe pickle:', file_upcoming_pkl)
upcoming_games_df.to_pickle(file_upcoming_pkl)
print('Saving dataframe pickle:', file_team_apps_csv)
team_apps_csv_df.to_pickle(file_team_apps_csv)

print(f"Finished saving csv and data-frame files: {now.strftime('%d/%m/%Y, %H:%M:%S')}")

Saving TeamAPP schedule CSV file and Dataframes for games: Saturday November 12, 2022 (2022/11/12)
File to save TeamApp schedule: fixture/schedule-teamsapp-2022_11_12.csv
Saving dataframe pickle: fixture/upcoming_games_df-2022_11_12.pkl
Saving dataframe pickle: fixture/team_apps_csv_df-2022_11_12.pkl
Finished saving csv and data-frame files: 09/11/2022, 19:22:26


# ------------ END FIXTURE PUBLISHING ------------

### Check a particular team

In [None]:
team = "U10 Girls Gold"

print(games_tapps_df.query("team_name == @team")['description'].values[0])
team_apps_csv_df.query("team_name == @team")[['team_name', 'opponent', 'start_date', 'start_time', 'venue', 'court']]
