# PlayHQ Fixture Scraping

This system scrapes game fixtures from [PlayHQ](http://playhq.com/) via its Public [API](https://support.playhq.com/hc/en-au/sections/4405422358297-PlayHQ-APIs).

It produces a CSV file ready to be uploaded as a Schedule in [TeamApp](https://brunswickmagicbasketball.teamapp.com/).

The *Public* APIs only require two header parameters to get a successful response: `x-api-key` (also referred to as the Client ID) and `x-phq-tenant` (the sport/association, in this case `bv`). Note that the Private APIs are not available to clubs and associations.

Detailed reference documentation for PlayHQ API can be found [here](https://docs.playhq.com/tech).

Contact: Sebastian Sardina (sssardina@gmail.com)

In [1]:
# from IPython.core.interactiveshell import InteractiveShell
# InteractiveShell.ast_node_interactivity = "all"

import pandas as pd
import re
import os
import calendar, datetime

# Set-up everything if running in Google Colab
if "COLAB_GPU" in os.environ:
  %pip install pyshorteners
  %pip install coloredlogs
  for f in ['utils.py', 'playhq.py', 'config.py']:
    if not os.path.exists(f):
      !wget "https://raw.githubusercontent.com/ssardina/tapp-fixture/main/{f}"

from config import *
import utils
import playhq as phq

## 1. Set-up application

First, create a connection to the PlayHQ Public API. Remember to set up the file `config.py` with the club and season configuration.


The *Public* APIs only require the following header parameters to get a successful response:

- `x-api-key` (also referred to as the Client ID) is provided by PlayHQ when you request access to the public API via their [support page](https://support.playhq.com/hc/en-au) or by emailing support@playhqsupport.zendesk.com.
- `x-phq-tenant` usually refers to the sport/association, in this case `bv`.
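
As a minimal sketch of how these two headers might be assembled before calling the API: the helper name `build_playhq_headers` and the placeholder key are illustrative assumptions and are not part of `playhq.py`.

```python
def build_playhq_headers(api_key: str, tenant: str = "bv") -> dict:
    """Return the two headers the Public API expects (hypothetical helper)."""
    if not api_key:
        raise ValueError("x-api-key is required")
    return {"x-api-key": api_key, "x-phq-tenant": tenant}


headers = build_playhq_headers("my-client-id")  # placeholder, not a real key
print(headers)
```

A dictionary like this can be passed as the `headers` argument of an HTTP client such as `requests.get`.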

New API credentials can only be created by a Super Administrator in the PlayHQ portal; the feature is disabled for other users. Use the credentials you have been given to call the PlayHQ public APIs.

In [2]:
GAME_DATE = utils.next_day(calendar.SATURDAY) # get the date of the upcoming Saturday (game day in competition)
GAME_DATE_TIMESTAMP = pd.to_datetime(GAME_DATE).tz_localize(TIMEZONE)
GAME_DATE_NAME = GAME_DATE_TIMESTAMP.strftime("%A %B %d, %Y (%Y/%m/%d)") # e.g., Saturday August 06, 2022 (2022/08/06)

phq_club = phq.PlayHQ(CLUB_NAME, ORG_ID, X_API_KEY, X_TENANT, TIMEZONE)
print(f"Set-up games for club {CLUB_NAME} for upcoming Saturday is: {GAME_DATE_NAME}")

PLAYHQ_URL=f"https://bv.playhq.com/org/{ORG_ID}/games?date={GAME_DATE_TIMESTAMP.strftime('%Y-%m-%d')}"
print("Game day at PlayHQ:", PLAYHQ_URL)
print("Club PlayHQ link: ", PLAYHQ_CLUB_SEASON)

Set-up games for club Brunswick Magic Basketball Club for upcoming Saturday is: Saturday September 10, 2022 (2022/09/10)
Game day at PlayHQ: https://bv.playhq.com/org/8c4d5431-eaa5-4644-82ac-992abe224b88/games?date=2022-09-10
Club PlayHQ link:  https://bit.ly/bmbc-w22
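
The game date above comes from `utils.next_day`, whose implementation is not shown here. A minimal sketch of how such a computation could work, using a hypothetical `next_weekday` helper that returns today's date when today already is the requested weekday (the real helper may behave differently in that case):

```python
import calendar
import datetime


def next_weekday(today: datetime.date, weekday: int) -> datetime.date:
    """Return the first date on or after `today` that falls on `weekday`."""
    days_ahead = (weekday - today.weekday()) % 7
    return today + datetime.timedelta(days=days_ahead)


# Tuesday 6 September 2022 -> the upcoming Saturday
print(next_weekday(datetime.date(2022, 9, 6), calendar.SATURDAY))  # 2022-09-10
```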


Next, check whether the PlayHQ `x-api-key` is already available; otherwise, ask the user for it so the PlayHQ API can be accessed.

In [3]:
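if X_API_KEY is None:
  X_API_KEY = input("Enter your x-api-key:")
  # The client above was created without a key, so rebuild it with the new one
  phq_club = phq.PlayHQ(CLUB_NAME, ORG_ID, X_API_KEY, X_TENANT, TIMEZONE)

print("x-api-key is defined!")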

x-api-key is defined!
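
An alternative to prompting every time is to look for the key in several places in order. The sketch below is an assumption-laden illustration: the helper `resolve_api_key` and the environment variable name `PLAYHQ_API_KEY` are hypothetical and not part of this project.

```python
import os


def resolve_api_key(configured, env_var="PLAYHQ_API_KEY", prompt=input):
    """Pick the key from config, then the environment, then the user (hypothetical helper)."""
    if configured:
        return configured
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env
    return prompt("Enter your x-api-key: ").strip()


# Simulate a user typing the key so the example runs unattended
key = resolve_api_key(None, env_var="PLAYHQ_API_KEY_DEMO_UNSET", prompt=lambda _: " abc123 ")
print(key)
```

Passing the prompt function as a parameter keeps the helper usable in a notebook (with `input`) and testable without user interaction.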


Now get the club's teams; they are used below.

In [4]:
season_id = phq_club.get_season_id(SEASON)
teams_df = phq_club.get_season_teams(season_id)
teams_df

| | id | name | grade.id | grade.name | grade.url | age |
|---|---|---|---|---|---|---|
| 0 | d7486008-b2fe-47db-8011-01265eaf1cfe | Magic U8 Mixed Purple | be7076cb-a1de-4033-8082-b52c8c149861 | Saturday U8 Mixed Division 1/2 | https://www.playhq.com/basketball-victoria/org... | 8 |
| 1 | 8d635867-0718-4f70-a1b8-1a224992d294 | Magic U16 Girls Gold | 19b86372-2ad5-4a5a-9592-467f826f7d63 | Saturday U16 Girls Division 1/2 | https://www.playhq.com/basketball-victoria/org... | 16 |
| 2 | 69d69214-641d-4ece-a2ce-ca9534865553 | Magic U16 Boys Purple | be4b2b9c-0bc1-44e2-ac3a-b89bb10230f8 | Saturday U16 Boys Division 3 | https://www.playhq.com/basketball-victoria/org... | 16 |
| 3 | 135c3bab-3004-4426-b9c6-3669b48ce271 | Magic U16 Boys Gold | ceac1ec4-c86b-47c6-9e67-124d923ea713 | Saturday U16 Boys Division 2 | https://www.playhq.com/basketball-victoria/org... | 16 |
| 4 | 99f369ed-5c63-40e7-8591-8cd18fba7ec0 | Magic U14 Boys Gold | d900616b-0f52-4d3e-8090-4eaf877491bb | Saturday U14 Boys Division 5 | https://www.playhq.com/basketball-victoria/org... | 14 |
| 5 | 8b2bc68b-9fda-435c-9c13-297d1d3187d1 | Magic U14 Girls Purple | b0785b1f-bec6-448b-a76d-83e1a6371fb3 | Saturday U14 Girls Division 3 | https://www.playhq.com/basketball-victoria/org... | 14 |
| 6 | 9ebeb141-4023-4744-84f1-f58957b966b2 | Magic U14 Girls Gold | b0785b1f-bec6-448b-a76d-83e1a6371fb3 | Saturday U14 Girls Division 3 | https://www.playhq.com/basketball-victoria/org... | 14 |
| 7 | 2b844e90-6575-47bc-bb49-f55840f8e6f8 | Magic U14 Girls Black | 237a9d9a-52c5-4136-b179-c01b10ad93ae | Saturday U14 Girls Division 4 | https://www.playhq.com/basketball-victoria/org... | 14 |
| 8 | 623c00bb-302b-4492-a988-90aacc3e7768 | Magic U14 Boys Purple | d900616b-0f52-4d3e-8090-4eaf877491bb | Saturday U14 Boys Division 5 | https://www.playhq.com/basketball-victoria/org... | 14 |
| 9 | a1e571f6-8ac4-49d2-9787-f62b35947eca | Magic U14 Boys Black | b877a73b-037d-460b-acb5-a2c45e5debee | Saturday U14 Boys Division 2 | https://www.playhq.com/basketball-victoria/org... | 14 |


## 2. Get the upcoming games for those teams

Using the club's teams, extract all their games for the upcoming game day.

In [None]:
from IPython.display import display

upcoming_games_df = phq_club.get_games(teams_df, GAME_DATE_TIMESTAMP)

if upcoming_games_df is not None:
    print(f'There were {upcoming_games_df.shape[0]} games extracted for game day: {GAME_DATE_NAME}')
    # A bare expression inside an if-block is not rendered, so display it explicitly
    display(upcoming_games_df[phq.GAMES_COLS])
else:
    print("No games for date: ", GAME_DATE_NAME)

## 3. Convert to TeamApp CSV format

Next, we convert the upcoming PlayHQ games to the TeamApp format so we can produce a CSV file to import into TeamApp.

In [None]:
games_tapps_df = utils.to_teamsapp_schedule(upcoming_games_df, desc_template=DESC_TAPP, game_duration=45)
print("Done computing the games for Teams App")
games_tapps_df.sample(3)
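
To illustrate the kind of transformation involved, the sketch below turns one game record into a schedule row with a 45-minute duration. The `end_time` column, the date/time formats, and the helper `game_to_event` are assumptions for illustration; the actual logic lives in `utils.to_teamsapp_schedule`.

```python
import pandas as pd


def game_to_event(game: dict, game_duration: int = 45) -> dict:
    """Map one game record to a schedule row (illustrative field names)."""
    start = pd.Timestamp(game["schedule_timestamp"])
    end = start + pd.Timedelta(minutes=game_duration)
    return {
        "team_name": game["team_name"],
        "start_date": start.strftime("%Y-%m-%d"),
        "start_time": start.strftime("%H:%M"),
        "end_time": end.strftime("%H:%M"),  # assumed column name
        "venue": game["venue_name"],
    }


# Made-up game record, not real fixture data
event = game_to_event(
    {"team_name": "U14 Girls Black", "schedule_timestamp": "2022-09-10 09:30", "venue_name": "Example Stadium"}
)
print(event)
```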

In [None]:
# Inspect description of one record
print(games_tapps_df.iloc[4]['event_name'])
print(games_tapps_df.iloc[4]['description'])

## 4. Extract BYE games

In [None]:
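# IDs of the teams that have a game on game day
playing_teams = upcoming_games_df['team_id'].tolist()
# Teams without a game have a bye
bye_teams = teams_df.loc[~teams_df['id'].isin(playing_teams), 'name'].tolist()
# Drop the club prefix, keeping the name from the age group onwards (e.g., "U14 Girls Black")
bye_teams = [re.search(r"U.*", x).group(0) for x in bye_teams]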

print(f"Bye teams ({len(bye_teams)}): ", bye_teams)
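
The bye detection above can be tried in isolation on toy data. The sketch below uses made-up team ids and a slightly stricter pattern (`U\d+.*`) than the cell above; both are assumptions for illustration.

```python
import re

import pandas as pd

# Toy data standing in for teams_df and upcoming_games_df
teams = pd.DataFrame({
    "id": ["a", "b", "c"],
    "name": ["Magic U8 Mixed Purple", "Magic U14 Boys Gold", "Magic U16 Girls Gold"],
})
games = pd.DataFrame({"team_id": ["a", "c"]})

# Teams whose id does not appear among the playing teams have a bye
bye_names = teams.loc[~teams["id"].isin(games["team_id"]), "name"].tolist()

# Keep the part of the name from the age group onwards
bye_short = [re.search(r"U\d+.*", name).group(0) for name in bye_names]
print(bye_short)  # ['U14 Boys Gold']
```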

In [None]:
# Extract the date of the round
# date = team_apps_csv_df.iloc[1]['start_date']
print(f"Extract BYE games for games on {GAME_DATE_NAME}")

# Build BYE entries for teams that do not have a game
if bye_teams:
    games_bye_df = utils.build_teamsapp_bye_schedule(bye_teams, GAME_DATE)
else:
    games_bye_df = None  # keeps the cells below working; pd.concat skips None entries
    print("No BYE games this round...")

games_bye_df

Next, combine the upcoming games and the BYE games into a single DataFrame.

In [None]:
team_apps_csv_df = pd.concat([games_tapps_df, games_bye_df])
team_apps_csv_df.drop_duplicates(inplace=True)
team_apps_csv_df.reset_index(inplace=True, drop=True)

team_apps_csv_df
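
The concatenate, de-duplicate, re-index pattern used above can be seen on a toy example. The column names and values below are made up for illustration.

```python
import pandas as pd

# Toy frames; the first one contains an accidental duplicate row
games = pd.DataFrame({
    "team_name": ["U14 Boys Gold", "U14 Boys Gold"],
    "event_name": ["Round 5", "Round 5"],
})
byes = pd.DataFrame({"team_name": ["U8 Mixed Purple"], "event_name": ["BYE"]})

# Stack both frames, drop exact duplicate rows, and renumber the index from 0
combined = pd.concat([games, byes]).drop_duplicates().reset_index(drop=True)
print(combined)
```

Without `reset_index(drop=True)`, the combined frame would keep the original row labels of each input, which repeat across the two frames.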



## 5. Save to CSV file for TeamApp import

### 5.1. FINAL CHECK

Before writing, review the games that will go into the Schedule CSV file and **CHECK ALL IS GOOD!**

In particular, look for games that are scheduled but still **PENDING**, i.e., missing some details (time or venue).

In [None]:
team_apps_csv_df.columns
team_apps_csv_df[['team_name', 'start_date', 'start_time', 'venue']]
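
The visual check can be complemented with a programmatic one. How PlayHQ marks a pending game is not shown here, so the sketch below assumes that missing details appear as empty or null values in `start_time` or `venue`; the data is made up.

```python
import pandas as pd

# Toy schedule; the second game has no time or venue yet
schedule = pd.DataFrame({
    "team_name": ["U14 Boys Gold", "U16 Girls Gold"],
    "start_date": ["2022-09-10", "2022-09-10"],
    "start_time": ["09:30", None],
    "venue": ["Example Stadium", ""],
})

check_cols = ["start_time", "venue"]
# A detail counts as missing if it is null or an empty string
missing = schedule[check_cols].isna() | schedule[check_cols].eq("")
incomplete = schedule[missing.any(axis=1)]
print(incomplete["team_name"].tolist())  # ['U16 Girls Gold']
```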

### 5.2. Write a TeamApp Schedule CSV

Finally, we save the data to a CSV file that can be imported into the [Schedule of TeamApp for all Entries](https://brunswickmagicbasketball.teamapp.com/clubs/263995/events?_list=v1&team_id=all).

In [None]:
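# `date` is not defined in this notebook; use the game date computed in the set-up cell
file_csv = os.path.join(OUTPUT_PATH, f"schedule-teamsapp-{GAME_DATE_TIMESTAMP.strftime('%Y_%m_%d')}.csv")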

print(f'Saving TeamAPP schedule CSV file for games on game date: {GAME_DATE_NAME}')
print('File to save TeamApp schedule:', file_csv)
team_apps_csv_df.to_csv(file_csv, index=False)

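### 5.3. Save the fixture DataFrames too

Pickle both DataFrames with a timestamp in the file name so they can be reloaded for the re-check in Section 6.
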
In [None]:
import datetime
import os

now = datetime.datetime.now() # current date and time

now_str = now.strftime("%Y-%m-%d_%H:%M:%S")
upcoming_games_df.to_pickle(os.path.join(OUTPUT_PATH, f"upcoming_games_df-{now_str}.pkl"))
team_apps_csv_df.to_pickle(os.path.join(OUTPUT_PATH, f"team_apps_csv_df-{now_str}.pkl"))

print(f"Finished saving dataframes: {now.strftime('%d/%m/%Y, %H:%M:%S')}")
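
The file names above contain colons, which are not allowed in file names on Windows. If portability matters, a variant along these lines could be used instead; the helper `timestamped_name` is a hypothetical illustration, and any previously saved files would keep their original names.

```python
import datetime


def timestamped_name(prefix: str, when: datetime.datetime, ext: str = "pkl") -> str:
    """Build a file name with a timestamp that avoids colons (hypothetical helper)."""
    return f"{prefix}-{when.strftime('%Y-%m-%d_%H-%M-%S')}.{ext}"


name = timestamped_name("upcoming_games_df", datetime.datetime(2022, 9, 6, 17, 14, 47))
print(name)  # upcoming_games_df-2022-09-06_17-14-47.pkl
```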

# ------------ END FIXTURE PUBLISHING ------------

## 6. Re-check Fixture

On Friday, just before game day, re-check whether any game has changed (e.g., venue or time).

First, reload the previously saved games (skip this if `upcoming_games_df` is still in memory), then re-extract the games into another DataFrame:

In [None]:
# Recover the pickled DataFrame saved earlier (comment out if using the one above directly)
FILE = "upcoming_games_df-2022-09-06_17:14:47.pkl"
upcoming_games_df = pd.read_pickle(os.path.join(OUTPUT_PATH, FILE))

upcoming_games_df[phq.GAMES_COLS]

In [None]:
upcoming_games2_df = phq_club.get_games(teams_df, GAME_DATE_TIMESTAMP)

print(f'There were {upcoming_games2_df.shape[0]} games extracted for game day: {GAME_DATE_NAME}')
upcoming_games2_df[phq.GAMES_COLS]

Now, check for differences if any:

In [None]:
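cols = ['team_name', 'schedule_timestamp', 'venue_name']

print("Report games that have changed since last extraction:")
# Rows present in only one of the two extractions identify the teams whose game changed
teams_changed = pd.concat([upcoming_games2_df[cols], upcoming_games_df[cols]]).drop_duplicates(keep=False)['team_name'].unique()
print("Teams with changes:", teams_changed)

# Old (saved) vs new (re-extracted) details side by side, for the changed teams only
old_df = upcoming_games_df[cols].query("team_name in @teams_changed")
new_df = upcoming_games2_df[cols].query("team_name in @teams_changed")
old_df.merge(new_df, how="outer", on="team_name", suffixes=('_old', '_new'))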
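
The change detection relies on `drop_duplicates(keep=False)`, which removes every row that occurs more than once. A toy example with made-up games shows the effect:

```python
import pandas as pd

cols = ["team_name", "schedule_timestamp", "venue_name"]
old = pd.DataFrame([["U14 Boys Gold", "2022-09-10 09:30", "Court 1"],
                    ["U16 Girls Gold", "2022-09-10 11:00", "Court 2"]], columns=cols)
new = pd.DataFrame([["U14 Boys Gold", "2022-09-10 09:30", "Court 1"],
                    ["U16 Girls Gold", "2022-09-10 12:15", "Court 2"]], columns=cols)

# Rows identical in both extractions appear twice and are dropped entirely;
# what remains are rows that exist in only one of the two
diff = pd.concat([new, old]).drop_duplicates(keep=False)
changed = diff["team_name"].unique().tolist()
print(changed)  # ['U16 Girls Gold']
```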

In [None]:
upcoming_games_df

In [None]:
# Teams that appear in only one of the two extractions (old = saved, new = re-extracted)
upcoming_games_df[cols].merge(upcoming_games2_df[cols], indicator=True, how='outer', on="team_name", suffixes=('_old', '_new')).loc[lambda x: x['_merge'] != 'both']
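
Merging on `team_name` alone, as in the cell above, flags teams that are present in only one extraction. To flag changed details as well, one option is to merge on all compared columns, sketched here on made-up data:

```python
import pandas as pd

cols = ["team_name", "schedule_timestamp", "venue_name"]
old = pd.DataFrame([["U14 Boys Gold", "2022-09-10 09:30", "Court 1"],
                    ["U16 Girls Gold", "2022-09-10 11:00", "Court 2"]], columns=cols)
new = pd.DataFrame([["U14 Boys Gold", "2022-09-10 09:30", "Court 1"],
                    ["U16 Girls Gold", "2022-09-10 12:15", "Court 2"]], columns=cols)

# With all columns as keys, a changed game shows up twice:
# once as 'left_only' (old details) and once as 'right_only' (new details)
merged = old.merge(new, how="outer", on=cols, indicator=True)
only_one = merged[merged["_merge"] != "both"]
print(only_one)
```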


### Check a particular team

In [None]:
upcoming_games_df.query("team_name == 'U14 Girls Black'")