# PlayHQ Fixture Scraping

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ssardina/tapp-fixture/blob/main/playhq_scrape.ipynb)

This notebook scrapes game fixtures from [PlayHQ](http://playhq.com/) via its Public [API](https://support.playhq.com/hc/en-au/sections/4405422358297-PlayHQ-APIs).

It produces a CSV file ready to be uploaded as a schedule in [TeamApp](https://brunswickmagicbasketball.teamapp.com/).

The *Public* APIs only require two header parameters to get a successful response: `x-api-key` (also referred to as the Client ID) and `x-phq-tenant` (the sport/association, in this case `bv`). Note that the Private APIs are not available to clubs and associations.

Detailed reference documentation for PlayHQ API can be found [here](https://docs.playhq.com/tech).

Contact: Sebastian Sardina (sssardina@gmail.com)

In [1]:
# from IPython.core.interactiveshell import InteractiveShell
# InteractiveShell.ast_node_interactivity = "all"

import pandas as pd
import re
import os
import calendar, datetime

# Set-up everything if running in Google Colab
if "COLAB_GPU" in os.environ:
  %pip install pyshorteners
  %pip install coloredlogs
  for f in ['utils.py', 'playhq.py', 'config.py']:
    if not os.path.exists(f):
      !wget "https://raw.githubusercontent.com/ssardina/tapp-fixture/main/{f}"

from config import *
import utils
import playhq as phq

## 1. Set-up application

First, create a connection to the PlayHQ Public API. Remember to set up the file `config.py` with the club and season configuration.


The *Public* APIs only require the below header parameters to get a successful response:

- `x-api-key` (also referred to as the Client ID) is provided by PlayHQ when you request access to the public API via their [support page](https://support.playhq.com/hc/en-au) or by emailing support@playhqsupport.zendesk.com. The key can be stored in a file `x_api_key.txt`; otherwise the notebook asks for it interactively. In many cases, creating new API credentials is disabled for regular users and can only be done by a Super Administrator in the PlayHQ portal.
- `x-phq-tenant` usually refers to the sport/association, in this case `bv`.
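
As a rough illustration of the two headers above, the sketch below loads the key from `x_api_key.txt` when that file exists and builds the header dictionary. The helper names `load_api_key` and `build_headers` are hypothetical and are not part of `playhq.py`, which holds the actual request logic.

```python
from pathlib import Path


def load_api_key(path="x_api_key.txt"):
    """Return the key stored in `path`, or None if the file is missing or empty."""
    key_file = Path(path)
    if not key_file.is_file():
        return None
    return key_file.read_text(encoding="utf-8").strip() or None


def build_headers(api_key, tenant="bv"):
    """Build the two headers named in the text above (hypothetical helper)."""
    return {"x-api-key": api_key, "x-phq-tenant": tenant}


# Placeholder fallback for this sketch; the notebook asks the user interactively instead.
api_key = load_api_key() or "<your-x-api-key>"
headers = build_headers(api_key)
```

The resulting dictionary can be passed as the `headers` argument of an HTTP client such as `requests.get`.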

The game date (`GAME_DATE`) is assumed to be the upcoming Saturday, but a specific day can be set here by uncommenting and editing the second line.

In [2]:
GAME_DATE = utils.next_day(calendar.SATURDAY) # get the date of the upcoming Saturday (game day in competition)
# GAME_DATE = datetime.date(2022, 8, 27)  # can also fix a particular game day
GAME_DATE_TIMESTAMP = pd.to_datetime(GAME_DATE).tz_localize(TIMEZONE)
GAME_DATE_NAME = GAME_DATE_TIMESTAMP.strftime("%A %B %d, %Y (%Y/%m/%d)") # e.g., Saturday August 06, 2022 (2022/08/06)

# if no x-api-key is defined, ask the user for it
if X_API_KEY is None:
  X_API_KEY = input("Enter your x-api-key:")
print("x-api-key is now defined!")

phq_club = phq.PlayHQ(CLUB_NAME, ORG_ID, X_API_KEY, X_TENANT, TIMEZONE)
print(f"Set-up games for club {CLUB_NAME} for upcoming Saturday is: {GAME_DATE_NAME}")

PLAYHQ_URL=f"https://bv.playhq.com/org/{ORG_ID}/games?date={GAME_DATE_TIMESTAMP.strftime('%Y-%m-%d')}"
print("Game day at PlayHQ:", PLAYHQ_URL)
print("Club PlayHQ link: ", PLAYHQ_CLUB_SEASON)

x-api-key is now defined!
Set-up games for club Brunswick Magic Basketball Club for upcoming Saturday is: Saturday October 15, 2022 (2022/10/15)
Game day at PlayHQ: https://bv.playhq.com/org/8c4d5431-eaa5-4644-82ac-992abe224b88/games?date=2022-10-15
Club PlayHQ link:  https://bit.ly/bmbc-s22
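
For reference, the "upcoming Saturday" computation behind `GAME_DATE` can be sketched with the standard library alone. This is not the code of `utils.next_day`; the function name `upcoming_weekday` is hypothetical, and the choice to return today's date when today is already the target weekday is an assumption.

```python
import calendar
import datetime


def upcoming_weekday(weekday, today=None):
    """Return the next date falling on `weekday` (0=Monday ... 6=Sunday).

    Assumption: if `today` is already that weekday, `today` itself is returned.
    """
    today = today or datetime.date.today()
    days_ahead = (weekday - today.weekday()) % 7
    return today + datetime.timedelta(days=days_ahead)


# Wednesday 12 October 2022 -> Saturday 15 October 2022
print(upcoming_weekday(calendar.SATURDAY, datetime.date(2022, 10, 12)))  # 2022-10-15
```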


Now get the club's teams; they will be used below.

In [3]:
season_id = phq_club.get_season_id(SEASON)
teams_df = phq_club.get_season_teams(season_id)
teams_df

Unnamed: 0,id,name,grade.id,grade.name,grade.url,age
0,fd7754ce-95db-4e6b-9230-e4e4f4b30456,Magic U8 Mixed Purple,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
1,ae5e5655-d374-4bc2-883b-80bb89da69f2,Magic U8 Mixed Gold,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
2,0e21ad4e-27fc-445d-844b-2a477d55f665,Magic U8 Mixed Black,4aac5ef6-fba9-482a-a8d5-0bb802af50fc,Saturday U8 Mixed Division 2,https://www.playhq.com/basketball-victoria/org...,8
3,03efc707-9627-45d8-978b-085df5201f89,Magic U18 Girls Gold,49e74fe2-686b-478c-a22f-b35cbc839c61,Saturday U18 Girls Division 1/2,https://www.playhq.com/basketball-victoria/org...,18
4,f6460436-a741-4b78-96d9-1b3cfe984fb9,Magic U18 Boys Purple,6d022389-498f-459c-a08c-b958cfa1eb2c,Saturday U18 Boys Division 4,https://www.playhq.com/basketball-victoria/org...,18
5,fd2ca9b6-8345-420d-9496-31620a0a1caf,Magic U16 Girls Purple,f336e651-9ad6-46bb-9ec8-7b2b13baf568,Saturday U16 Girls Division 4,https://www.playhq.com/basketball-victoria/org...,16
6,1a66c6a8-89ba-4838-b076-7724a0488675,Magic U16 Girls Gold,f336e651-9ad6-46bb-9ec8-7b2b13baf568,Saturday U16 Girls Division 4,https://www.playhq.com/basketball-victoria/org...,16
7,529ce712-83f0-439e-9d12-a63a6769037f,Magic U16 Boys Purple,cddbe9fa-645e-4703-affa-0837d9d1bc56,Saturday U16 Boys Division 5,https://www.playhq.com/basketball-victoria/org...,16
8,9ce81394-0bce-4abc-bf1e-2c59ef687f0a,Magic U16 Boys Gold,a191049a-92b6-4f63-abfa-da42b0fc4947,Saturday U16 Boys Division 2,https://www.playhq.com/basketball-victoria/org...,16
9,d1c4722e-7e84-4577-85a0-98707dab65d2,Magic U16 Boys Diamond,1b0adca9-2085-4f25-8100-7e674d2e2adc,Saturday U16 Boys Division 3,https://www.playhq.com/basketball-victoria/org...,16


## 2. Get the upcoming games for those teams

Using the club's teams, extract all their games for the upcoming game day.

In [4]:
upcoming_games_df = phq_club.get_games(teams_df, GAME_DATE_TIMESTAMP)

if upcoming_games_df is not None:
    print(f'There were {upcoming_games_df.shape[0]} games extracted for game day: {GAME_DATE_NAME}')
    upcoming_games_df[phq.GAMES_COLS]
else:
    print("No games for date: ", GAME_DATE_NAME)

2022-10-12 16:23:27 INFO Games extracted for team: Magic U8 Mixed Purple
2022-10-12 16:23:27 INFO Games extracted for team: Magic U8 Mixed Gold
2022-10-12 16:23:28 INFO Games extracted for team: Magic U8 Mixed Black
2022-10-12 16:23:28 INFO Games extracted for team: Magic U18 Girls Gold
2022-10-12 16:23:28 INFO Games extracted for team: Magic U18 Boys Purple
2022-10-12 16:23:28 INFO Games extracted for team: Magic U16 Girls Purple
2022-10-12 16:23:29 INFO Games extracted for team: Magic U16 Girls Gold
2022-10-12 16:23:29 INFO Games extracted for team: Magic U16 Boys Purple
2022-10-12 16:23:29 INFO Games extracted for team: Magic U16 Boys Gold
2022-10-12 16:23:30 INFO Games extracted for team: Magic U16 Boys Diamond
2022-10-12 16:23:30 INFO Games extracted for team: Magic U16 Boys Black
2022-10-12 16:23:30 INFO Games extracted for team: Magic U14 Girls Black
2022-10-12 16:23:30 INFO Games extracted for team: Magic U14 Girls Purple
2022-10-12 16:23:31 INFO Games extracted for team: Magic

There were 27 games extracted for game day: Saturday October 15, 2022 (2022/10/15)


## 3. Convert to TeamApp CSV format

Next, we convert the upcoming PlayHQ games to the TeamApp format so that we can produce a CSV file to import into TeamApp.

In [5]:
games_tapps_df = utils.to_teamsapp_schedule(upcoming_games_df, desc_template=DESC_TAPP, game_duration=45)
print("Done computing the games for Teams App")

games_tapps_df.sample(3)
games_tapps_df

Done computing the games for Teams App


Unnamed: 0,event_name,team_name,start_date,end_date,start_time,end_time,description,venue,location,access_groups,rsvp,comments,attendance_tracking,duty_roster,ticketing
0,U8 Mixed Purple - Round 2,U8 Mixed Purple,2022-10-15,2022-10-15,08:30:00,09:15:00,RSVP mandatory for the game.\n\nOpponent: Newl...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U8 Mixed Purple,1,1,0,1,0
1,U8 Mixed Gold - Round 2,U8 Mixed Gold,2022-10-15,2022-10-15,08:30:00,09:15:00,RSVP mandatory for the game.\n\nOpponent: Pant...,Coburg North Primary School,"180 OHEA ST, COBURG",U8 Mixed Gold,1,1,0,1,0
2,U8 Mixed Black - Round 2,U8 Mixed Black,2022-10-15,2022-10-15,08:30:00,09:15:00,RSVP mandatory for the game.\n\nOpponent: St F...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U8 Mixed Black,1,1,0,1,0
3,U18 Girls Gold - Round 2,U18 Girls Gold,2022-10-15,2022-10-15,16:00:00,16:45:00,RSVP mandatory for the game.\n\nOpponent: Newl...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U18 Girls Gold,1,1,0,1,0
4,U18 Boys Purple - Round 2,U18 Boys Purple,2022-10-15,2022-10-15,16:00:00,16:45:00,RSVP mandatory for the game.\n\nOpponent: Newl...,Pascoe Vale Girls College,"Cornwall Road, Pascoe Vale",U18 Boys Purple,1,1,0,1,0
5,U16 Girls Purple - Round 2,U16 Girls Purple,2022-10-15,2022-10-15,14:30:00,15:15:00,RSVP mandatory for the game.\n\nOpponent: Newl...,Northcote High School,"19-25 St Georges Road, Northcote",U16 Girls Purple,1,1,0,1,0
6,U16 Girls Gold - Round 2,U16 Girls Gold,2022-10-15,2022-10-15,13:45:00,14:30:00,RSVP mandatory for the game.\n\nOpponent: Thun...,Dallas Brooks Community Primary School,"26-36 King Street, Dallas",U16 Girls Gold,1,1,0,1,0
7,U16 Boys Purple - Round 2,U16 Boys Purple,2022-10-15,2022-10-15,15:15:00,16:00:00,RSVP mandatory for the game.\n\nOpponent: Warr...,Dallas Brooks Community Primary School,"26-36 King Street, Dallas",U16 Boys Purple,1,1,0,1,0
8,U16 Boys Gold - Round 2,U16 Boys Gold,2022-10-15,2022-10-15,13:00:00,13:45:00,RSVP mandatory for the game.\n\nOpponent: STAR...,Oak Park Stadium,"9 Hillcrest Road, Oak Park",U16 Boys Gold,1,1,0,1,0
9,U16 Boys Diamond - Round 2,U16 Boys Diamond,2022-10-15,2022-10-15,13:00:00,13:45:00,RSVP mandatory for the game.\n\nOpponent: Pira...,Northcote High School,"19-25 St Georges Road, Northcote",U16 Boys Diamond,1,1,0,1,0
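
In the table above, `end_time` appears to be `start_time` plus the `game_duration` of 45 minutes passed to `utils.to_teamsapp_schedule`. A minimal sketch of that arithmetic, using a hypothetical helper `teamapp_times` rather than the actual code in `utils.py`:

```python
import datetime


def teamapp_times(start, duration_min=45):
    """Derive the date/time strings of one schedule row from a start datetime."""
    end = start + datetime.timedelta(minutes=duration_min)
    return {
        "start_date": start.strftime("%Y-%m-%d"),
        "end_date": end.strftime("%Y-%m-%d"),
        "start_time": start.strftime("%H:%M:%S"),
        "end_time": end.strftime("%H:%M:%S"),
    }


slot = teamapp_times(datetime.datetime(2022, 10, 15, 8, 30))
print(slot["start_time"], slot["end_time"])  # 08:30:00 09:15:00
```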


Inspect what the description of one of the games will look like:

In [6]:
team = "U12 Girls Gold"

# Inspect the description of the first game found for that team
game = games_tapps_df.loc[games_tapps_df['team_name'] == team].iloc[0]
print("Description for:", game['event_name'], "\n")
print(game['description'])



Description for: U12 Girls Gold - Round 2 

RSVP mandatory for the game.

Opponent: St Fidelis U12 Girls Blue
Venue: Coburg North Primary School (Court 1)
Address: 180 OHEA ST, COBURG 
Google Maps coord: https://maps.google.com/?q=(-37.7351,144.95113)

- Please ensure you arrive early and ready.
- Remember that shorts should have no pockets, players should not wear bracelets/watch as it is a risk of injury.
- No food in the venue and pickup your rubbish.
- Games will have 2x20 min halves.
- Each team needs to provide a scorer. TMs, please consider a roster.
- Players should not bring balls into the venue - game balls provided by Magic in coach's equipment bag.
- Beginners refs will be wearing green shirts. Please support and respect them through a POSITIVE sideline behaviour.

Check the game in PlayHQ: https://tinyurl.com/2lrsdxe7
Check the round in PlayHQ: https://tinyurl.com/2l2f5d6g
All clubs in PlayHQ: https://bit.ly/bmbc-s22



## 4. Extract BYE games

We now extract the teams for which we couldn't scrape a game. In most cases this means a BYE for those teams.

In [7]:
playing_teams = upcoming_games_df['team_id'].tolist()
bye_teams = teams_df.loc[~teams_df['id'].isin(playing_teams)]['name'].tolist()
bye_teams = list(map(lambda x: re.search("U.*", x).group(0), bye_teams))

print(f"Bye teams ({len(bye_teams)}): ", bye_teams)

Bye teams (0):  []
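
Note that `re.search("U.*", x)` returns `None` for a team name that contains no `U`, in which case `.group(0)` raises an `AttributeError`. A defensive variant could look like the sketch below; the helper name `short_team_name` and the stricter pattern `U\d+` (a `U` followed by the age digits) are assumptions, not part of the notebook.

```python
import re


def short_team_name(full_name):
    """Strip the club prefix, keeping the part from the age group (e.g. 'U8') onwards.

    Falls back to the full name when no age group is found.
    """
    match = re.search(r"U\d+.*", full_name)
    return match.group(0) if match else full_name


print(short_team_name("Magic U8 Mixed Purple"))  # U8 Mixed Purple
```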


Build a table with BYE games for those teams.

In [8]:
# Extract the date of the round
# date = team_apps_csv_df.iloc[1]['start_date']
print(f"Extract BYE games for games on {GAME_DATE_NAME}")

# Build BYE entries for teams without a game; stays None when every team plays
games_bye_df = None
if bye_teams:
    games_bye_df = utils.build_teamsapp_bye_schedule(bye_teams, GAME_DATE)
else:
    print("No BYE games this round...")

games_bye_df

Extract BYE games for games on Saturday October 15, 2022 (2022/10/15)
No BYE games this round...

Next, combine the upcoming games and the BYE games into a single table that will later be used to produce the CSV for the TeamApp schedule import.

In [9]:
if bye_teams:
    team_apps_csv_df = pd.concat([games_tapps_df, games_bye_df])
    team_apps_csv_df.drop_duplicates(inplace=True)
    team_apps_csv_df.reset_index(inplace=True, drop=True)
else:
    team_apps_csv_df = games_tapps_df

team_apps_csv_df.sample(4)

Unnamed: 0,event_name,team_name,start_date,end_date,start_time,end_time,description,venue,location,access_groups,rsvp,comments,attendance_tracking,duty_roster,ticketing
17,U12 Girls Purple - Round 2,U12 Girls Purple,2022-10-15,2022-10-15,10:45:00,11:30:00,RSVP mandatory for the game.\n\nOpponent: Thun...,Northcote High School,"19-25 St Georges Road, Northcote",U12 Girls Purple,1,1,0,1,0
22,U12 Boys Black - Round 2,U12 Boys Black,2022-10-15,2022-10-15,10:00:00,10:45:00,RSVP mandatory for the game.\n\nOpponent: St F...,Dallas Brooks Community Primary School,"26-36 King Street, Dallas",U12 Boys Black,1,1,0,1,0
21,U12 Boys Diamond - Round 2,U12 Boys Diamond,2022-10-15,2022-10-15,10:00:00,10:45:00,RSVP mandatory for the game.\n\nOpponent: Newl...,St John's College (Preston),"21 Railway Place West, Preston",U12 Boys Diamond,1,1,0,1,0
2,U8 Mixed Black - Round 2,U8 Mixed Black,2022-10-15,2022-10-15,08:30:00,09:15:00,RSVP mandatory for the game.\n\nOpponent: St F...,Coburg Basketball Stadium,"25 Outlook Road, Coburg North",U8 Mixed Black,1,1,0,1,0


## 5. Save to CSV file for TeamApp import

### 5.1. FINAL CHECK

Before writing, list the games that will go into the schedule CSV file and **CHECK ALL IS GOOD!**

In particular, look for games that are scheduled but still **PENDING**, that is, missing details such as time or venue.

In [10]:
team_apps_csv_df.columns
team_apps_csv_df[['team_name', 'start_date', 'start_time', 'venue']]

Unnamed: 0,team_name,start_date,start_time,venue
0,U8 Mixed Purple,2022-10-15,08:30:00,Coburg Basketball Stadium
1,U8 Mixed Gold,2022-10-15,08:30:00,Coburg North Primary School
2,U8 Mixed Black,2022-10-15,08:30:00,Coburg Basketball Stadium
3,U18 Girls Gold,2022-10-15,16:00:00,Coburg Basketball Stadium
4,U18 Boys Purple,2022-10-15,16:00:00,Pascoe Vale Girls College
5,U16 Girls Purple,2022-10-15,14:30:00,Northcote High School
6,U16 Girls Gold,2022-10-15,13:45:00,Dallas Brooks Community Primary School
7,U16 Boys Purple,2022-10-15,15:15:00,Dallas Brooks Community Primary School
8,U16 Boys Gold,2022-10-15,13:00:00,Oak Park Stadium
9,U16 Boys Diamond,2022-10-15,13:00:00,Northcote High School


We stop the execution here in case the whole notebook is being run at once (e.g., with *Run All*).

In [None]:
raise SystemExit("Stop right there! Continue below to produce the CSV file if needed.")

### 5.2. Write a TeamApp Schedule CSV

Finally, we save the data to a CSV file that can be imported into the [schedule of TeamApp for all entries](https://brunswickmagicbasketball.teamapp.com/clubs/263995/events?_list=v1&team_id=all).

In [12]:
# Create the output folder if it does not exist yet
os.makedirs(OUTPUT_PATH, exist_ok=True)

file_csv = os.path.join(OUTPUT_PATH, f"schedule-teamsapp-{GAME_DATE.strftime('%Y_%m_%d')}.csv")

print(f'Saving TeamAPP schedule CSV file for games on game date: {GAME_DATE_NAME}')
print('File to save TeamApp schedule:', file_csv)
team_apps_csv_df.to_csv(file_csv, index=False)

Saving TeamAPP schedule CSV file for games on game date: Saturday October 15, 2022 (2022/10/15)
File to save TeamApp schedule: fixture/schedule-teamsapp-2022_10_15.csv
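
As an extra sanity check, the written file can be read back to flag rows that still lack a time or venue. The sketch below assumes that missing values end up as empty cells in the CSV (the default behaviour of `DataFrame.to_csv`); how PlayHQ actually marks pending games is not confirmed here, and the helper name `rows_missing_details` is hypothetical.

```python
import csv


def rows_missing_details(csv_path, required=("start_time", "venue")):
    """Return the CSV rows in which any of the `required` columns is empty."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [
            row
            for row in csv.DictReader(f)
            if any(not (row.get(col) or "").strip() for col in required)
        ]
```

Calling `rows_missing_details(file_csv)` after the cell above would return the incomplete rows, if any, so that they can be reviewed before importing into TeamApp.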


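### 5.3. Write fixture dataframe too

We also save both dataframes as pickle files so that the fixture can be re-checked later (see Section 6).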

In [13]:
import datetime
import os

now = datetime.datetime.now() # current date and time

now_str = now.strftime("%Y-%m-%d_%H:%M:%S")
upcoming_games_df.to_pickle(os.path.join(OUTPUT_PATH, f"upcoming_games_df-{now_str}.pkl"))
team_apps_csv_df.to_pickle(os.path.join(OUTPUT_PATH, f"team_apps_csv_df-{now_str}.pkl"))

print(f"Finished saving dataframes: {now.strftime('%d/%m/%Y, %H:%M:%S')}")

Finished saving dataframes: 12/10/2022, 16:24:28
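
One caveat: the timestamp format `%H:%M:%S` puts colons in the file names, which Windows file systems do not allow. If portability matters, a colon-free timestamp is one option. The helper `safe_timestamp` below is a hypothetical sketch; adopting it would also change the file name expected in Section 6.

```python
import datetime


def safe_timestamp(moment=None):
    """Format a datetime for use in file names, avoiding ':' characters."""
    moment = moment or datetime.datetime.now()
    return moment.strftime("%Y-%m-%d_%H-%M-%S")


stamp = safe_timestamp(datetime.datetime(2022, 10, 12, 16, 24, 28))
print(f"upcoming_games_df-{stamp}.pkl")  # upcoming_games_df-2022-10-12_16-24-28.pkl
```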


# ------------ END FIXTURE PUBLISHING ------------

## 6. Re-check Fixture

On Friday, just before the game day, re-check to see if any game has been changed (e.g., venue, time).

First, reload the previously saved upcoming games:

In [None]:
# Recover the pickled dataframe (comment out if using the one above directly)
FILE = "upcoming_games_df-2022-10-12_09:47:32.pkl"
upcoming_games_df = pd.read_pickle(os.path.join(OUTPUT_PATH, FILE))

upcoming_games_df[phq.GAMES_COLS]

Now scrape the new set of upcoming games:

In [None]:
upcoming_games_new_df = phq_club.get_games(teams_df, GAME_DATE_TIMESTAMP)

print(f'There were {upcoming_games_new_df.shape[0]} games extracted for game day: {GAME_DATE_NAME}')
upcoming_games_new_df[phq.GAMES_COLS]

Now check for differences, if any, and show the games that have changed, are new, or have been dropped:

In [None]:
cols = ['team_name', 'schedule_timestamp', 'venue_name']

# Rows that appear in only one of the two extractions identify the affected teams
teams_changed = pd.concat([upcoming_games_new_df[cols], upcoming_games_df[cols]]).drop_duplicates(keep=False)['team_name'].unique()
print("Teams whose games have changed (updated, new, dropped):", teams_changed)

old_games_df = upcoming_games_df[cols].query("team_name in @teams_changed")
new_games_df = upcoming_games_new_df[cols].query("team_name in @teams_changed")
# Outer merge so that new games (no old row) and dropped games (no new row) are kept too
changed_games_df = new_games_df.merge(old_games_df, how="outer", on="team_name", suffixes=('_new', '_old'))

changed_games_df
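
To tell updated, new, and dropped games apart explicitly, one option is an outer merge with `indicator=True`, which records whether each row came from the new extraction, the old one, or both. The sketch below is self-contained and uses toy data; the function name `classify_changes` is hypothetical, and it assumes at most one game per team in each table.

```python
import pandas as pd


def classify_changes(old_df, new_df, key="team_name",
                     cols=("schedule_timestamp", "venue_name")):
    """Label each team's game as 'new', 'dropped', 'updated' or 'unchanged'."""
    merged = new_df.merge(old_df, how="outer", on=key,
                          suffixes=("_new", "_old"), indicator=True)

    def label(row):
        if row["_merge"] == "left_only":   # only in the new extraction
            return "new"
        if row["_merge"] == "right_only":  # only in the old extraction
            return "dropped"
        changed = any(row[f"{c}_new"] != row[f"{c}_old"] for c in cols)
        return "updated" if changed else "unchanged"

    merged["status"] = merged.apply(label, axis=1)
    return merged.drop(columns="_merge")


# Toy data for illustration only
old = pd.DataFrame({"team_name": ["A", "B", "C"],
                    "schedule_timestamp": ["08:30", "09:15", "10:00"],
                    "venue_name": ["Court 1", "Court 2", "Court 3"]})
new = pd.DataFrame({"team_name": ["A", "B", "D"],
                    "schedule_timestamp": ["08:30", "09:15", "11:00"],
                    "venue_name": ["Court 1", "Court 9", "Court 4"]})
result = classify_changes(old, new)
print(result[["team_name", "status"]])
```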

### Check a particular team

In [None]:
upcoming_games_df.query("team_name == 'U14 Girls Black'")