# **Table of Contents**

##### **IMPORTANT: Input Cleaned CSV, Shot CSV, Point CSV, and Point Json into "Cleaned" folder for the match in Google Drive**

### **Shot CSV:**
1. [Load in Data](#Load-Data)
2. [Manually Input Meta Data](#manually-input-meta-data)
- **You must manually type in values**
3. [Shot CSV Error Checks](#shot-csv-Error-Checks)
4. [Add Shot CSV Columns](#Add-Shot-CSV-Columns)
5. [Output Shot CSV](#Output-ShotCSV)

### **Point CSV:**
1. [Create Point CSV](#Create-PointCSV)
2. [Add Point CSV Columns](#add-point-csv-columns)
3. [Point CSV Error Checks](#point-csv-error-checks)
4. [Output Point CSV](#output-pointcsv)
- **Point CSV is for visuals** 
- **Point Json is for Upload to Website**

### **EDA:**
1. [Summary Stats](#Summary-stats)
2. [Serve and Return Stats](#serve-and-return-stats)
3. [Breakpoint Stats](#breakpoint-stats)
4. [Serve Win Percentage](#serve-win-percentage)
5. [Error Stats](#error-stats)

# **To Do:**
#### **Error Checks to Fix**
- [Check rows with mismatched serve in/serve zone](#Check-rows-with-mismatched-serve-in/serve-zone)
- [Check all points where double fault occurs (firstServeIn == 0 & secondServeIn == 0) but len(shotInRally) > 1](#check-all-points-where-double-fault-occurs-firstservein--0--secondservein--0-but-lenshotinrally--1)
- [Check all the points where everytime the server changes, the first pointScore should be "0-0". If not output error](#check-all-the-points-where-everytime-the-server-changes-the-first-pointscore-should-be-0-0-if-not-output-error)


#### **Columns**
- [Change isApproach to be aggregated from Coordinate Data](#isapproach-column)
        - see if the next consecutive shotInRally coordinates are further up into the court
- [Depths Count (Short, Deep) Columns](#depths-count-short-deep-columns)
- [Add Columns from Leo (isLet, serverLocation, returnerLocation?)](#reorder-dataframe-for-output)

# **Load Data**

In [58]:
import pandas as pd
import numpy as np
import os 
import re

# Option to display max rows/columns
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

# Put your Uncleaned .csv file name here
your_file_name = 'Shot_Visuals_KaylanBigun_MiguelPerezPena.csv'
shot_data = pd.read_csv(your_file_name)
shot_data.shape

(556, 71)

In [59]:
shot_data.head(40)

Unnamed: 0,pointScore,gameScore,setScore,isPointStart,pointStartTime,isPointEnd,pointEndTime,pointNumber,isBreakPoint,shotInRally,side,serverName,serverFarNear,firstServeIn,firstServeZone,firstServeXCoord,firstServeYCoord,secondServeIn,secondServeZone,secondServeXCoord,secondServeYCoord,isAce,shotContactX,shotContactY,shotDirection,shotFhBh,isSlice,isVolley,isOverhead,isApproach,isDropshot,isExcitingPoint,atNetPlayer1,atNetPlayer2,isLob,shotLocationX,shotLocationY,isWinner,isErrorWideR,isErrorWideL,isErrorNet,isErrorLong,clientTeam,Date,Division,Event,lineupPosition,matchDetails,matchVenue,opponentTeam,player1Name,player2Name,player1Hand,player2Hand,Round,Surface,Notes,isTopspin,isFlat,isKick,tiebreakScore,returnerName,shotHitBy,InsideOut,InsideIn,isDoubleFault,pointWonBy,lastShotError,serveResult,serveInPlacement,depth
0,0-0,0-0,0-0,1.0,50,,,1,,1,Deuce,Kaylan Bigun,Near,1.0,T,-21.194674,221.413851,,,,,,17.115704,-449.70491,,,1.0,,,,,,,,,,,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,lfjdsfds,,,,,Miguel Perez Pena,Kaylan Bigun,,,,,,1st Serve In,T,
1,0-0,0-0,0-0,,920,,,1,,2,Deuce,Kaylan Bigun,Near,,,,,,,,,,-31.924945,544.516442,Down the Line,Backhand,,,,,,,,,,-9.225776,-260.806266,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,,1.0,,,Miguel Perez Pena,Miguel Perez Pena,,,,,,,,Short
2,0-0,0-0,0-0,,2280,,,1,,3,Deuce,Kaylan Bigun,Near,,,,,,,,,,2.080785,-415.949087,Down the Line,Backhand,,,,,,,,,,104.861324,270.769554,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,1.0,,,,Miguel Perez Pena,Kaylan Bigun,,1.0,,,,,,Short
3,0-0,0-0,0-0,,3450,1.0,3450.0,1,,4,Ad,Kaylan Bigun,Near,,,,,,,,,,125.669751,477.487653,Crosscourt,Backhand,,,,,,,,,,-16.562495,-463.428018,,,,,1.0,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,1.0,,,,Miguel Perez Pena,Miguel Perez Pena,1.0,,,Kaylan Bigun,1.0,,,Long
4,15-0,0-0,0-0,1.0,22219,,,2,,1,Ad,Kaylan Bigun,Near,1.0,Wide,134.205211,196.300241,,,,,,-85.933727,-496.218662,,,1.0,,,,,,,,,,,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,,,,,Miguel Perez Pena,Kaylan Bigun,,,,,,1st Serve In,Wide,
5,15-0,0-0,0-0,,23180,,,2,,2,Ad,Kaylan Bigun,Near,,,,,,,,,,177.067848,507.447638,Down the Line,Backhand,,,,,,,,,,77.846829,-237.115737,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,1.0,,,,Miguel Perez Pena,Miguel Perez Pena,,1.0,,,,,,Short
6,15-0,0-0,0-0,,24580,1.0,24580.0,2,,3,Deuce,Kaylan Bigun,Near,,,,,,,,,,13.339157,-449.343695,Crosscourt,Backhand,,,,,,,,,,-75.353614,-11.698206,,,,1.0,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,1.0,,,,Miguel Perez Pena,Kaylan Bigun,1.0,,,Miguel Perez Pena,1.0,,,
7,15-15,0-0,0-0,1.0,53950,,,3,,1,Deuce,Kaylan Bigun,Near,1.0,T,-10.378855,234.102614,,,,,,26.922058,-483.034717,,,,,,,,,,,,,,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,,1.0,,,Miguel Perez Pena,Kaylan Bigun,,,,,,1st Serve In,T,
8,15-15,0-0,0-0,,54759,,,3,,2,Deuce,Kaylan Bigun,Near,,,,,,,,,,-49.898808,535.031802,Down the Line,Backhand,1.0,,,,,,,,,-39.165513,-178.602389,,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,,,,,Miguel Perez Pena,Miguel Perez Pena,,,,,,,,Short
9,15-15,0-0,0-0,,56779,1.0,56779.0,3,,3,Ad,Kaylan Bigun,Near,,,,,,,,,,-51.998693,-331.67897,Down the Line,Backhand,,,,,,,,,,-156.886849,449.631266,1.0,,,,,UCLA,,Division 1,,,,,,Kaylan Bigun,Miguel Perez Pena,Right,Left,2,Hard,,1.0,,,,Miguel Perez Pena,Kaylan Bigun,,,,Kaylan Bigun,,,,Deep


# **Manually Input Meta Data**

In [60]:
# Fill in meta data
shot_data['clientTeam'] = 'UCLA'
shot_data['Date'] = ''
shot_data['Division'] = 'Division 1'
shot_data['Event'] = ''
shot_data.loc[0, 'lineupPosition'] = ''                 # use.loc[0, .] to only assign to first row
shot_data.loc[0, 'matchDetails'] = ''
shot_data['matchVenue'] = ''
shot_data['opponentTeam'] = ''
shot_data['player1Name'] = 'Kaylan Bigun'
shot_data['player2Name'] = 'Miguel Perez Pena'
shot_data['player1Hand'] = 'Right'
shot_data['player2Hand'] = 'Left'
shot_data['Round'] = '2'
shot_data['Surface'] = 'Hard'
shot_data.loc[0, 'Notes'] = 'lfjdsfds'

  shot_data.loc[0, 'lineupPosition'] = ''                 # use.loc[0, .] to only assign to first row
  shot_data.loc[0, 'matchDetails'] = ''


## Assign Player Names

In [61]:
player1_name = shot_data['player1Name'].iloc[0]
player2_name = shot_data['player2Name'].iloc[0]
p = shot_data['serverName'].unique()

if (player1_name == None) or (player1_name == ''):
    raise ValueError('player1Name is blank')

if (player2_name == None) or (player2_name == ''):
    raise ValueError('player1Name is blank')

# Assign values in the serverName column
shot_data['serverName'] = shot_data['serverName'].replace({'Player1': player1_name, 'Player2': player2_name})

if not len(shot_data['serverName'].unique()) == 2:
    print(shot_data['serverName'].unique())
    raise ValueError('Unkown Name!')

print('Check Passed ✓')

Check Passed ✓


# **Shot CSV Error Checks**

#### Check for missing player1Hand and player2Hand

In [62]:
player1_hand =  shot_data.at[0, 'player1Hand']
player2_hand =  shot_data.at[0, 'player2Hand']

if (player1_hand == None) or (player1_hand == ''):
    raise ValueError('player1Hand has no value!')

if (player2_hand == None) or (player2_hand == '') == None:
    raise ValueError('player2Hand has no value!')

print('Check Passed ✓')

Check Passed ✓


#### Check for missing 'Deuce' or 'Ad' sides in side column

In [63]:
# Dataframe of all missing sides
missing_sides = shot_data[~shot_data['side'].isin(['Deuce', 'Ad'])][['pointScore', 'gameScore', 'setScore','side', 'shotInRally']]

if len(missing_sides) > 0:
    display(missing_sides)
    raise ValueError('Missing Deuce or Ad value in sides Column')
else: print('Check Passed ✓')

Check Passed ✓


####  Check pointNumber increases consecutively
- swingvision data: check score columns (most likely incorrect) and check how the point ends columns 
(isErrorLong, isErrorWideR, isErrorWideL, isErrorNet likely marked wrong)
- Fix all points and then delete all values from pointNumber column to let pointNumber fill

In [64]:
# Fill in pointNumber if blank
if 'pointNumber' not in shot_data.columns or shot_data['pointNumber'].isnull().any():
    point_starts = (shot_data['isPointStart'] == 1)
    shot_data['pointNumber'] = point_starts.cumsum()

# Check if pointNumber incresase consecutively
point_numbers = shot_data['pointNumber'].unique()
non_consecutive = [point_numbers[i] for i in range(1, len(point_numbers)) if point_numbers[i] != point_numbers[i-1] + 1]

if non_consecutive:
    non_consecutive_rows = shot_data[shot_data['pointNumber'].isin(non_consecutive)]
    display(non_consecutive_rows[['pointScore', 'gameScore', 'setScore', 'pointNumber']])
    print(non_consecutive_rows['pointNumber'].unique())
    raise ValueError(f"Non-consecutive point numbers found: {list(non_consecutive_rows['pointNumber'].unique())}")
else: print('Check Passed ✓')

Check Passed ✓


#### Check for NA values
- all should be 0, except if there's a tiebreak then missing pointScore should match amount of tiebreak shots

In [65]:
# Count empty strings in each column
empty_string_counts = ((shot_data.isna()).sum())
non_zero_counts = empty_string_counts[(empty_string_counts == shot_data.shape[0])]

na_counts = shot_data[[ 'pointScore', 'shotInRally', 'gameScore', 'setScore', 'side', 'serverName']].isna().sum()
has_na = shot_data[['pointScore', 'shotInRally', 'gameScore', 'setScore', 'side', 'serverName']].isna().any().any()

if has_na:
    display(na_counts)
    raise ValueError('Empty of cells in these columns')
else: 
    print('Check Passed ✓')
    display(non_zero_counts)

Check Passed ✓


isApproach         556
isDropshot         556
isExcitingPoint    556
isLob              556
tiebreakScore      556
dtype: int64

####  Check for missing shotInRally rows

In [66]:
# All rows of missing shotInRally
empty_shot_rows = shot_data[shot_data['shotInRally'].isnull()]

if not empty_shot_rows.empty:
    display(empty_shot_rows[['pointScore', 'gameScore', 'setScore', 'pointStartTime', 'shotInRally']])
else:
    print('Check Passed ✓')


Check Passed ✓



#### Check missing isPointStart and isPointEnd


In [67]:
missing_point_start = list()
missing_point_end = list()

for i in point_numbers:
    current_df = shot_data[shot_data['pointNumber'] == i].reset_index(drop=True)
    if not current_df.loc[0,'isPointStart'] == 1:
        # Find the shot_data index of the missing isPointStart row
        index_start = shot_data.index[(shot_data['pointNumber'] == i) & (shot_data['shotInRally'] == 1)][0]
        missing_point_start.append(index_start)

    if not current_df.loc[len(current_df) - 1,'isPointEnd'] == 1:
        # Find last shotInRally of current_df
        last_rally = current_df['shotInRally'].unique()[-1]
        index_end = shot_data.index[(shot_data['pointNumber'] == i) & (shot_data['shotInRally'] == last_rally)][0]
        missing_point_end.append(index_end)

if (len(missing_point_start) > 0) or (len(missing_point_end) > 0):
    print('Number of rows with isPointStart = 1:', len(missing_point_start))
    print('Number of rows with isPointEnd = 1:', len(missing_point_end), '\n')
    print('Missing isPointStart rows:')
    display(shot_data.loc[missing_point_start])
    print('Missing isPointEnd rows:')
    display(shot_data.loc[missing_point_end])
    raise ValueError('Missing values!')

print('Check Passed ✓')

Check Passed ✓


#### Check for same amount of isPointStart and isPointEnd

In [68]:
# Count of isPointStart and isPointEnd
num_point_start = shot_data['isPointStart'].sum()
num_point_end = shot_data['isPointEnd'].sum()

if num_point_start != num_point_end:
    print("Number of rows with isPointStart = 1:", num_point_start)
    print("Number of rows with isPointEnd = 1:", num_point_end)
    raise ValueError("Error: count of isPointStart = 1 and isPointEnd = 1 are not the same.")
else: print('Check Passed ✓')

Check Passed ✓


#### Check that score doesn't have incorrect date format
- accounts for all variations of dates eg. (0-00-0000, 0/0/0000)

In [69]:
# Make Jan-00 back into 1-0 for Game/Set Score
# Make Scores Strings not Date Time
columns_to_convert = ['gameScore', 'setScore']
shot_data[columns_to_convert] = shot_data[columns_to_convert].astype(object)

# Define a mapping for month abbreviations
month_mapping = {'Jan': '1', 'Feb': '2', 'Mar': '3', 'Apr': '4', 'May': '5', 'Jun': '6',
                 'Jul': '7', 'Aug': '8', 'Sep': '9', 'Oct': '10', 'Nov': '11', 'Dec': '12'}

# Function to convert string like 'Jan-00' to '1-0'
def convert_score_string(score_str):
    # Check if the string has a month abbreviation and a year ending with '00'
    if re.match(r'^\d{1,2}-[A-Za-z]{3}$', score_str):
        # Extract year and month abbreviation
        year, month = score_str.split('-')

        # Remove leading zeros from the year
        year = str(int(year))

        # Replace month abbreviation with corresponding number
        month_number = month_mapping.get(month, month)

        # Concatenate the parts to form the transformed string
        transformed_str = f'{year}-{month_number}'
        return transformed_str

    # Check if the string has a month abbreviation and a year with leading '0's
    elif re.match(r'^[A-Za-z]{3}-\d{1,2}$', score_str):
        # Extract month abbreviation and year
        month, year = score_str.split('-')

        # Replace month abbreviation with corresponding number
        month_number = month_mapping.get(month, month)

        # Remove leading zeros from the year
        year = str(int(year))

        # Concatenate the parts to form the transformed string
        transformed_str = f'{month_number}-{year}'
        return transformed_str

    # Check if the string has a date in the format 'month/day/year'
    elif re.match(r'^\d{1,2}/\d{1,2}/\d{4}$', score_str):
        # Extract month, day, and year
        month, day, year = score_str.split('/')

        # Remove leading zeros from month and day
        month = str(int(month))
        day = str(int(day))

        # Concatenate the parts to form the transformed string
        transformed_str = f'{month}-{day}'
        return transformed_str

    # Check if the string has a date in the format 'month-day-year'
    elif re.match(r'^\d{1,2}-\d{1,2}-\d{4}$', score_str):
        # Extract month, day, and year
        month, day, year = score_str.split('-')

        # Remove leading zeros from month and day
        month = str(int(month))
        day = str(int(day))

        # Concatenate the parts to form the transformed string
        transformed_str = f'{month}-{day}'
        return transformed_str

    return score_str

# Apply the conversion function to the relevant columns in shot_data
shot_data['gameScore'] = shot_data['gameScore'].apply(convert_score_string)
shot_data['setScore'] = shot_data['setScore'].apply(convert_score_string)


#### Check for incorrect game and set scores

In [70]:
# Assuming shot_data is your DataFrame
set_scores = shot_data['setScore'].unique()
game_scores = shot_data['gameScore'].unique()

print("Unique Set Scores:\n", set_scores)
print("Unique Game Scores:\n", game_scores)

Unique Set Scores:
 ['0-0' '0-1']
Unique Game Scores:
 ['0-0' '0-1' '0-2' '2-1' '3-1' '3-2' '3-3' '3-4' '3-5' '4-5' '5-5' '6-5'
 '4-2' '5-2' '5-3' '5-4']


#### Check all rows where isPointStart does not start at beginning of rally

In [71]:
filtered_rows = shot_data[(shot_data['isPointStart'] == 1) & (shot_data['shotInRally'] != 1)]

if filtered_rows.empty:
    print('Check Passed ✓')
else:
    display(filtered_rows)
    raise ValueError('Rows where isPointStart = 1 and shotInRally != 1')

Check Passed ✓


#### Find the rows where isPointEnd = 1 and shotInRally = 1 but is neither an ace or a double fault

In [72]:
filtered_rows = shot_data[
    (shot_data['isPointEnd'] == 1) & 
    (shot_data['shotInRally'] == 1) & 
    (shot_data['firstServeIn'] != 0) &
    (shot_data['secondServeIn'] != 0) &
    (shot_data['isAce'] != 1)
]

if filtered_rows.empty:
    print('Check Passed ✓')
else:
    display(filtered_rows)
    raise ValueError('Rows where isPointEnd = 1, shotInRally = 1, firstServeIn and secondServeIn are both not 0, and isAce != 1')

Check Passed ✓



#### Check rows where there are duplicate isPointStart = 1 and isPointEnd points
- WARNING: This will output rows that have pointScore such as "40-40", "40-A", "A-40" (Ad Scoring)
- Ignore if so

In [73]:
# Output rows where isPointStart is 1 and pointScore, gameScore, and setScore have the same value
filtered_rows = shot_data[shot_data['isPointStart'] == 1]
output_rows = filtered_rows[filtered_rows.duplicated(subset=['pointScore', 'gameScore', 'setScore'], keep=False)]

if not output_rows.empty:
    display(output_rows)
    raise ValueError("Rows where isPointStart is 1 and pointScore, gameScore, and setScore have the same value:")

# Output rows where isPointEnd is 1 and pointScore, gameScore, and setScore have the same value
filtered_rows = shot_data[shot_data['isPointEnd'] == 1]
output_rows = filtered_rows[filtered_rows.duplicated(subset=['pointScore', 'gameScore', 'setScore'], keep=False)]

if not output_rows.empty:
    display(output_rows[['pointScore', 'gameScore', 'setScore', 'isPointStart']])
    raise ValueError('Rows where isPointEnd is 1 and pointScore, gameScore, and setScore have the same value!')

print('Check Passed ✓')


Check Passed ✓


#### Check if shotInRally is duplicated or consectively increasing

In [74]:
# Check if shotInrally is not duplicated
output_rows = shot_data[shot_data.duplicated(subset=['pointScore', 'gameScore', 'setScore', 'shotInRally'], keep=False)]['pointNumber'].unique().tolist()
rows = shot_data[shot_data.duplicated(subset=['pointScore', 'gameScore', 'setScore', 'shotInRally'], keep=False)]

if len(output_rows) > 0:
    display(rows)
    raise ValueError(f'Duplicated shotInRally rows! \n Check pointNumber(s): {output_rows}')

# Check if shotInRally is consectively increasing
shotInRally_error = list()

for i in point_numbers:
    current_df = shot_data[shot_data['pointNumber'] == i]
    # Check if shotInrRally in current pointNumber is strictly increasing
    if not (current_df['shotInRally'].diff().dropna() > 0).all():
        shotInRally_error.append(i)
        
if len(shotInRally_error) > 0:
    raise ValueError(f'shotInRally not consectively increasing! \n Check pointNumber(s): {shotInRally_error}')


print('Check Passed ✓')

Check Passed ✓


#### Check all the rows where isPointEnd != 1 and there is  isWinner, isErrorWideL, isErrorWideR, isErrorNet, isErrorLong

In [75]:
point_error = shot_data[(shot_data['isPointEnd'] != 1) & 
                        (shot_data['isPointStart'] != 1) & # Added isPointStart: swingvision marks isErrorNet = 1 for serves
                        ((shot_data['isWinner'] == 1) | 
                         (shot_data['isErrorNet'] == 1) | 
                         (shot_data['isErrorLong'] == 1) |
                         (shot_data['isErrorWideL'] == 1) |
                         (shot_data['isErrorWideR'] == 1))]

point_error_numbers = point_error['pointNumber'].to_list()

if len(point_error) > 0:
    display(point_error)
    raise ValueError('Manually check points', point_error_numbers)

print('Check Passed ✓')


Check Passed ✓


#### Check all the rows where there is isPointEnd == 1 but there is no isWinner, isErrorWideL, isErrorWideR, isErrorNet, isErrorLong
- Cj recommendation: have this error check automatically fill in how the point ends based on coordinate data

In [76]:
point_error = shot_data[(shot_data['isPointEnd'] == 1) &
                          (shot_data['isWinner'] != 1) &
                          (shot_data['isErrorWideL'] != 1) &
                          (shot_data['isErrorWideR'] != 1) &
                          (shot_data['isErrorNet'] != 1) & 
                          (shot_data['isErrorLong'] != 1) &
                          (shot_data['firstServeIn'] != 0) & 
                          (shot_data['secondServeIn'] != 0)]

point_error_numbers = point_error['pointNumber'].to_list()

if point_error.empty:
    print('Check Passed ✓')
else:
    display(point_error)
    raise ValueError('Manually check points', point_error_numbers)

Check Passed ✓


#### Check rows with mismatched serve in/serve zone
- **NEED TO DO: FIX, logic is wrong with "x+2"**

In [77]:
# Find indices of all rows with firstServeIn data
first_serve_row = shot_data.index[shot_data['firstServeIn'].notnull()]
# Find indices of all rows with firstServeZone data
first_serve_placement_row = shot_data.index[shot_data['firstServeZone'].notnull()]
# Check which indices are not found in one column but are found in the other
mismatched_first_serve_rows = list(set(first_serve_row).difference(first_serve_placement_row))
# Add 2 to the indices to match numbering of Google Sheets
mismatched_first_serve_rows = [x+2 for x in mismatched_first_serve_rows]
if not mismatched_first_serve_rows:
    print("Check passed for first serves.")
else:
    print("Rows where first serve in and first serve zone are not found together: " + str(mismatched_first_serve_rows))

second_serve_row = shot_data.index[shot_data['secondServeIn'].notnull()]
second_serve_placement_row = shot_data.index[shot_data['secondServeZone'].notnull()]
mismatched_second_serve_rows = list(set(second_serve_row).difference(first_serve_placement_row))
mismatched_second_serve_rows = [x+2 for x in mismatched_second_serve_rows]
if not mismatched_second_serve_rows:
    print("Check passed for second serves.")
else:
    print("Rows where second serve in and second serve zone are not found together: " + str(mismatched_second_serve_rows))

Check passed for first serves.
Rows where second serve in and second serve zone are not found together: [449]


#### Check all points where double fault occurs (firstServeIn == 0 & secondServeIn == 0) but len(shotInRally) > 1
- **NEED TO DO: Check double fault but the point continues**

#### Check all the points where everytime the server changes, the first pointScore should be "0-0". If not output error
- **NEED TO DO: Check incorrect scoring**
- **swing_vision Govind Nanda vs Cooper Williams (Harvard) row 380**

# **Add Shot CSV Columns**

#### tiebreakScore Column

In [78]:
def reverse_point_score(score):
    if '-' in score:
        parts = score.split('-')
        return '-'.join(parts[::-1])
    return score

if 'tiebreakScore' not in shot_data.columns or shot_data['tiebreakScore'].isnull().any():
    shot_data.loc[shot_data['gameScore'] == '6-6', 'tiebreakScore'] = shot_data['pointScore']
    # Apply reverse_point_score function where serverName is 'Player2' and tiebreakScore is not NaN
    shot_data.loc[
    (shot_data['serverName'] == player2_name) & (shot_data['tiebreakScore'].notna()), 
    'tiebreakScore'] = shot_data.loc[
    (shot_data['serverName'] == player2_name) & (shot_data['tiebreakScore'].notna()), 
    'tiebreakScore'
    ].apply(reverse_point_score)

# Set the pointScore to NaN where tiebreakScore is not NaN
shot_data.loc[pd.notna(shot_data['tiebreakScore']), 'pointScore'] = np.nan


#### returnerName Column

In [79]:
def get_returner_name(server_name):
    return player2_name if server_name == player1_name else player1_name

shot_data['returnerName'] = shot_data['serverName'].apply(get_returner_name)
print(f"Player 1 = {player1_name}, Player 2 = {player2_name}")

Player 1 = Kaylan Bigun, Player 2 = Miguel Perez Pena


#### shotHitBy Column

In [80]:
shot_data['shotHitBy'] = shot_data.apply(lambda row: row['serverName'] if row['shotInRally'] % 2 == 1 else row['returnerName'], axis=1)

#### isInsideOut and isInsideIn Column

In [81]:
shot_data['InsideOut'] = None
shot_data['InsideIn'] = None

def inside_out(hit_by, side, fhbh, direction):
    if hit_by == player1_name:
        player_hand = player1_hand
    else:
        player_hand = player2_hand

    if player_hand == "Right":
        if side == "Deuce" and fhbh == "Backhand" and direction == "Crosscourt":
            return 1
        elif side == "Ad" and fhbh == "Forehand" and direction == "Crosscourt":
            return 1
    else:
        if side == "Ad" and fhbh == "Backhand" and direction == "Crosscourt":
            return 1
        elif side == "Deuce" and fhbh == "Forehand" and direction == "Crosscourt":
            return 1        
            
def inside_in(hit_by, side, fhbh, direction):
    if hit_by == player1_name:
        player_hand = player1_hand
    else:
        player_hand = player2_hand

    if player_hand == "Right":
        if side == "Deuce" and fhbh == "Backhand" and direction == "Down the Line":
            return 1
        elif side == "Ad" and fhbh == "Forehand" and direction == "Down the Line":
            return 1
    else:
        if side == "Ad" and fhbh == "Backhand" and direction == "Down the Line":
            return 1
        elif side == "Deuce" and fhbh == "Forehand" and direction == "Down the Line":
            return 1        

shot_data['InsideOut'] = shot_data.apply(lambda x: inside_out(x['shotHitBy'], x['side'], x['shotFhBh'], x['shotDirection']), axis = 1)
shot_data['InsideIn'] = shot_data.apply(lambda x: inside_in(x['shotHitBy'], x['side'], x['shotFhBh'], x['shotDirection']), axis = 1)

#### isAce Column

In [82]:
# Add the Ace column
shot_data['isAce'] = None

for index, row in shot_data.iterrows():
    if row['isPointEnd'] == 1:
        if row['shotInRally'] == 1: # last point is serve
            if (row['firstServeIn'] == 1 or row['secondServeIn'] == 1): # either first or second serve went in
                shot_data.at[index, 'isAce'] = 1

#### isDoubleFault Column

In [83]:
# Add the DoubleFault column
shot_data['isDoubleFault'] = None

for index, row in shot_data.iterrows():
    if row['isPointEnd'] == 1:
        if row['shotInRally'] == 1: # last point is serve
            if (row['firstServeIn'] != 1 and row['secondServeIn'] != 1): # either first or second serve went in
                shot_data.at[index, 'isDoubleFault'] = 1

#### pointWonBy and lastShotError Columns

In [84]:
# Add the 'pointWonBy' column
shot_data['pointWonBy'] = None

# Add the 'lastShotError' column
shot_data['lastShotError'] = None

for index, row in shot_data.iterrows():
    if row['isPointEnd'] == 1:
        if row['shotInRally'] == 1: # last point is serve
            if row['isAce'] == 1: 
                shot_data.at[index, 'pointWonBy'] = row['serverName']
            elif row['isDoubleFault'] == 1: 
                shot_data.at[index, 'pointWonBy'] = row['returnerName']

                
        elif row['shotInRally'] != 1:
            if row['isErrorWideR'] == 1 or row['isErrorWideL'] == 1 or row['isErrorNet'] == 1 or row['isErrorLong'] == 1: # if error
                shot_data.at[index, 'lastShotError'] = 1
                
                if row['shotInRally'] % 2 == 0:
                    shot_data.at[index, 'pointWonBy'] = row['serverName']
                else:
                    shot_data.at[index, 'pointWonBy'] = row['returnerName']
        
            elif row['isWinner'] == 1:
                if row['shotInRally'] % 2 == 0:
                    shot_data.at[index, 'pointWonBy'] = row['returnerName']
                else:
                    shot_data.at[index, 'pointWonBy'] = row['serverName']

# Backward fill pointWonBy
shot_data['pointWonBy'].bfill();

#### serveResult and serveInPlacement Columns

In [85]:
conditions = [
    (shot_data['isPointStart'] == 1) & (shot_data['firstServeIn'] == 1),
    (shot_data['isPointStart'] == 1) & (shot_data['firstServeIn'] != 1) & (shot_data['secondServeIn'] == 1),
    (shot_data['isPointStart'] == 1) & (shot_data['firstServeIn'] != 1) & (shot_data['secondServeIn'] != 1),]

# Define the values to be assigned for each condition
values_result = ['1st Serve In', '2nd Serve In', 'Double Fault']
values_placement = [shot_data['firstServeZone'], shot_data['secondServeZone'], shot_data['secondServeZone']]

# Use numpy.select to assign values based on conditions
shot_data['serveResult'] = np.select(conditions, values_result, default='')
shot_data['serveInPlacement'] = np.select(conditions, values_placement, default='')

In [86]:
shot_data.replace('', None, inplace=True)

#### depth Column

In [87]:
def depth_metric(shotInRally, x, y, side):
    
    if (x >= -157.5) & (x <= 157.5):
    
        if side == 'Near':
            if shotInRally % 2 == 0:
                if -455 < y < -350: return 'Deep'
                if -350 < y < 0: return 'Short'
                if y < -455: return 'Long'

            elif shotInRally % 2 == 1:
                if 455 > y > 350: return 'Deep'
                if 0 < y < 350: return 'Short'
                if y > 455: return 'Long'

        elif side == 'Far':
            if shotInRally % 2 == 1:
                if -455 < y < -350: return 'Deep'
                if -350 < y < 0: return 'Short'
                if y < -455: return 'Long'

            elif shotInRally % 2 == 0:
                if 455 > y > 350: return 'Deep'
                if 0 < y < 350: return 'Short'
                if y > 455: return 'Long'
        
shot_data['depth'] = shot_data.apply(lambda x: depth_metric(x['shotInRally'], x['shotLocationX'], x['shotLocationY'], x['serverFarNear']), axis=1)

#### isApproach Column

In [88]:
# Future implementation form Leo's team using a Classification Model

#### atNetPlayer1 and atNetPlayer2 Columns

In [89]:
shot_data['atNetPlayer1'] = None
shot_data['atNetPlayer2'] = None


# Define the criteria for being at the player's net
def is_at_player_net(x, y):
    return 1 if -245 <= y <= 245 and -157.5 <= x <= 157.5 else ''

player1Name = shot_data['player1Name'].loc[0] 
player2Name = shot_data['player2Name'].loc[0]

# Apply the criteria based on the serverName
shot_data.loc[shot_data['shotHitBy'] == player1Name, 'atNetPlayer1'] = shot_data.apply(lambda row: is_at_player_net(row['shotContactX'], row['shotContactY']), axis=1)
shot_data.loc[shot_data['shotHitBy'] == player2Name, 'atNetPlayer2'] = shot_data.apply(lambda row: is_at_player_net(row['shotContactX'], row['shotContactY']), axis=1)

# **Output ShotCSV**

In [90]:
player1NameNoSpace = shot_data.iloc[0]['player1Name'].replace(" ", "")
player2NameNoSpace = shot_data.iloc[0]['player2Name'].replace(" ", "")

# Save csv
shot_data.to_csv(f'Shot_Visuals_{player1NameNoSpace}_{player2NameNoSpace}.csv', index=False)

# **Create PointCSV**

In [91]:
# Creating point_df (with only 1 row for each pointNumber)
point_df = shot_data.drop_duplicates(subset='pointNumber')[['pointNumber']]

# **Add Point CSV Columns**

#### player1Name and player2Name Columns

In [92]:
# Extract the first value of player1Name and player2Name from shot_data
player1_name = shot_data['player1Name'].iloc[0]
player2_name = shot_data['player2Name'].iloc[0]

# Fill in the first value into all rows of point_df['player1Name'] and point_df['player2Name']
point_df['player1Name'] = player1_name
point_df['player2Name'] = player2_name

#### Scores Columns

In [93]:
point_df['pointScore'] = shot_data.groupby('pointNumber')['pointScore'].first().values
point_df['gameScore'] = shot_data.groupby('pointNumber')['gameScore'].first().values
point_df['setScore'] = shot_data.groupby('pointNumber')['setScore'].first().values
point_df['tiebreakScore'] = shot_data.groupby('pointNumber')['tiebreakScore'].first().values

#### side Column

In [94]:
# Group shot_data by 'pointNumber' and get the first 'side' value for each group
side_values = shot_data.groupby('pointNumber')['side'].first().reset_index()
point_df['side'] = side_values['side'].values

#### serverName, returnerName, and team Columns

In [95]:
# Adds Server and Returner Names and pointScore
point_df['serverName'] = shot_data.groupby('pointNumber')['serverName'].first().values
point_df['returnerName'] = shot_data.groupby('pointNumber')['returnerName'].first().values

client_team_value = shot_data.loc[0, 'clientTeam']
opponent_team_value = shot_data.loc[0, 'opponentTeam']

point_df['clientTeam'] = client_team_value
point_df['opponentTeam'] = opponent_team_value

#### pointStartTime, pointEndPosition, and Duration Columns

In [96]:
# Add Start and End times per point
for index, row in shot_data.iterrows():
    point_number = row['pointNumber']
    
    if row['isPointStart'] == 1:
        point_df.loc[point_df['pointNumber'] == point_number, 'Position'] = row['pointStartTime']
    if row['isPointEnd'] == 1:
        point_df.loc[point_df['pointNumber'] == point_number, 'pointEndPosition'] = row['pointEndTime']

# Add Duration
point_df['Duration'] = point_df['pointEndPosition'] - point_df['Position']

#### rallyCount Column

In [97]:
# Find the highest shotInRally for each pointNumber in shot_data
max_rally_per_point = shot_data.groupby('pointNumber')['shotInRally'].max().reset_index()
point_df['rallyCount'] = list(max_rally_per_point['shotInRally'])

# Add 'rallyCountFreq' column 
point_df['rallyCountFreq'] = point_df['rallyCount'].apply(lambda x: '1-4' if 1 <= x <= 4 else 
                                                          ('5-8' if 5 <= x <= 8 else 
                                                           ('9-12' if 9 <= x <= 12 else 
                                                            ('13+' if x >= 13 else 'Error'))))

# Convert to Categorical with specific levels
point_df['rallyCountFreq'] = pd.Categorical(point_df['rallyCountFreq'], 
                                             categories=['1-4', '5-8', '9-12', '13+'], 
                                             ordered=True)

point_df['rallyCountFreq']


0       1-4
4       1-4
7       1-4
10      1-4
11      5-8
16     9-12
27     9-12
39      1-4
42      1-4
44      5-8
50     9-12
62      5-8
68      1-4
71      1-4
75      5-8
83      1-4
87     9-12
97      1-4
100    9-12
110     1-4
112     1-4
115    9-12
127     1-4
131     5-8
137     1-4
141     1-4
145     5-8
150     1-4
154     5-8
159     1-4
161     5-8
166     1-4
168     5-8
173     1-4
176     1-4
178     5-8
186     1-4
188     1-4
191     5-8
196    9-12
206     1-4
208    9-12
219     13+
232     1-4
234     1-4
237     1-4
238     5-8
243     5-8
250     1-4
254     1-4
256     1-4
258     5-8
265     1-4
267     1-4
269     5-8
275     1-4
279     1-4
280     5-8
288     1-4
291     1-4
295     1-4
297     1-4
299     1-4
303     5-8
309     1-4
313     1-4
315     5-8
323     1-4
325     1-4
329    9-12
338     1-4
341     1-4
343     1-4
345     5-8
353     1-4
355     1-4
357    9-12
366     1-4
368     1-4
372     1-4
376     1-4
380     1-4
384     1-4
386 

#### Serve Columns

In [98]:
point_df['firstServeIn'] = 0
point_df['secondServeIn'] = 0

for point_number in shot_data['pointNumber'].unique():
    # Assign firstServeIn 
    if any((shot_data['pointNumber'] == point_number) & (shot_data['firstServeIn'] == 1)):
        point_df.loc[point_df['pointNumber'] == point_number, 'firstServeIn'] = 1
    # Assign secondServeIn
    if any((shot_data['pointNumber'] == point_number) & (shot_data['secondServeIn'] == 1)):
        point_df.loc[point_df['pointNumber'] == point_number, 'secondServeIn'] = 1

# Add serveResult and serveInPlacement
start_points = shot_data[shot_data['isPointStart'] == 1]
point_df['serveResult'] = start_points['serveResult'].values
point_df['serveInPlacement'] = start_points['serveInPlacement'].values

# Add firstServeZone and secondServeZone
serve_zones = shot_data.loc[shot_data['shotInRally'] == 1, ['pointNumber', 'firstServeZone', 'secondServeZone', 'firstServeIn', 'secondServeIn']].drop_duplicates()
point_df['firstServeZone'] = shot_data.groupby('pointNumber')['firstServeZone'].first().values
point_df['secondServeZone'] = shot_data.groupby('pointNumber')['secondServeZone'].first().values

#### Ace Column

In [99]:
point_df['isAce'] = ((point_df['rallyCount'] == 1) & ((point_df['serveResult'] != "Double Fault")))

#### Server Coordinate Data Columns

In [100]:
# Add serverFarNear
point_df['serverFarNear'] = shot_data.groupby('pointNumber')['serverFarNear'].first().values

# Add firstServeXCoord and firstServeYCoord
point_df['firstServeXCoord'] = shot_data.groupby('pointNumber')['firstServeXCoord'].first().values
point_df['firstServeYCoord'] = shot_data.groupby('pointNumber')['firstServeYCoord'].first().values

# Add secondServeXCoord and secondServeYCoord
point_df['secondServeXCoord'] = shot_data.groupby('pointNumber')['secondServeXCoord'].first().values
point_df['secondServeYCoord'] = shot_data.groupby('pointNumber')['secondServeYCoord'].first().values

#### Server and Returner Start Location Columns
- **ignore for older tagging csvs that don't have serverStartLocation and returnerStartLocation columns**

In [101]:
# Add serverStartLocation and returnerStartLocation
#point_df['serverStartLocation'] = shot_data.groupby('pointNumber')['serverStartLocation'].first().values
#point_df['returnerStartLocation'] = shot_data.groupby('pointNumber')['returnerStartLocation'].first().values

#### Return Columns

In [102]:
point_df['returnDirection'] = None
point_df['returnFhBh'] = None

for point_number in shot_data['pointNumber'].unique():
    # shotInRally == 2 for returns
    if 2 in shot_data.loc[shot_data['pointNumber'] == point_number, 'shotInRally'].values:
        row_with_return_info = shot_data[(shot_data['pointNumber'] == point_number) & (shot_data['shotInRally'] == 2)].iloc[0]

        # Add/assign returnDirection and returnFhBh
        point_df.loc[point_df['pointNumber'] == point_number, 'returnDirection'] = row_with_return_info['shotDirection']
        point_df.loc[point_df['pointNumber'] == point_number, 'returnFhBh'] = row_with_return_info['shotFhBh']

#### errorType Column

In [103]:
# Create an empty DataFrame to store the results
error_results = pd.DataFrame(columns=['errorType', 'pointNumber'])

# Iterate through entire shot_data
for index, row in shot_data.iterrows():
    pointNumber = row['pointNumber']
    point_error_value = None
    
    if row['isErrorWideR'] == 1:
        point_error_value = 'Wide Right'
    elif row['isErrorWideL'] == 1:
        point_error_value = 'Wide Left'
    elif 'isErrorNet' in row and row['isErrorNet'] == 1:
        point_error_value = 'Net'
    elif row['isErrorLong'] == 1:
        point_error_value = 'Long'
    

    # If an error is found, append the result to the error_results DataFrame
    if point_error_value is not None:
        error_results = pd.concat([error_results, pd.DataFrame({'pointNumber': [pointNumber], 'errorType': [point_error_value]})], ignore_index=True)


# Drop duplicates based on 'pointNumber'
error_results = error_results.drop_duplicates(subset=['pointNumber'])

In [104]:
# Create a dictionary mapping 'pointNumber' to 'errorType' in error_results
error_type_mapping = dict(zip(error_results['pointNumber'], error_results['errorType']))

# Create 'errorType' column in point_df based on the mapping
point_df['errorType'] = point_df['pointNumber'].map(error_type_mapping)

point_df = point_df.replace({np.nan: None})

#### returnError Column

In [105]:
def get_return_error(row):
    if row['rallyCount'] == 2:
        return row['errorType']
    else:
        return None

point_df.loc[point_df['pointNumber'] == point_number, 'serveInPlacement'] = shot_data['secondServeZone']  

# Apply the functions to create the new columns
point_df['returnError'] = point_df.apply(get_return_error, axis=1)


#### lastShot Columns

In [106]:
point_df['lastShotDirection'] = None
point_df['lastShotFhBh'] = None
point_df['lastShotHitBy'] = None  
point_df['lastShotResult'] = None  

# Iterate through unique pointNumbers in shot_data
for point_number in shot_data['pointNumber'].unique():
    # Check if isPointEnd == 1 exists for the given pointNumber
    if 1 in shot_data.loc[shot_data['pointNumber'] == point_number, 'isPointEnd'].values:
        # Get the information from the corresponding row
        row_with_lastshot_info = shot_data[(shot_data['pointNumber'] == point_number) & (shot_data['isPointEnd'] == 1)].iloc[0]

        # Assign values to 'lastShotDirection' and 'lastShotFhBh' columns
        point_df.loc[point_df['pointNumber'] == point_number, 'lastShotDirection'] = row_with_lastshot_info['shotDirection']
        point_df.loc[point_df['pointNumber'] == point_number, 'lastShotFhBh'] = row_with_lastshot_info['shotFhBh']
        point_df.loc[point_df['pointNumber'] == point_number, 'lastShotHitBy'] = row_with_lastshot_info['shotHitBy']
        
        # Determine lastShotResult based on conditions
        if row_with_lastshot_info['isWinner'] == 1 and not row_with_lastshot_info['isAce']:
            point_df.loc[point_df['pointNumber'] == point_number, 'lastShotResult'] = "Winner"
        elif row_with_lastshot_info['lastShotError'] == 1:
            point_df.loc[point_df['pointNumber'] == point_number, 'lastShotResult'] = "Error"

#### pointWonBy Column

In [107]:
# Initialize variables to keep track of the state
prev_point_number = None
point_won_by_list = []

# Iterate through the DataFrame
for index, row in shot_data.iterrows():
    if row['isPointEnd'] == 1:
        # Check if pointNumber is different and consecutively increasing
        if prev_point_number is None or row['pointNumber'] == prev_point_number + 1:
            # Append pointWonBy to the list
            point_won_by_list.append(row['pointWonBy'])
            prev_point_number = row['pointNumber']
        else:
            raise ValueError("Error: Point numbers are not different or consecutively increasing.")
            break

# Add point_won_by_list as a new column to point_df
point_df['pointWonBy'] = point_won_by_list

#### isExcitingPoint Column

In [108]:
point_df['isExcitingPoint'] = shot_data.groupby('pointNumber')['isExcitingPoint'].count()

#### isBreakPoint Column

In [109]:
break_point_values = ['0-40', '15-40', '30-40', '40-40']
point_df['isBreakPoint'] = point_df['pointScore'].isin(break_point_values)

#### atNetPlayer1 and atNetPlayer2 Columns

In [110]:
for i in point_numbers:
    # atNetPlayer1
    if any((shot_data['pointNumber'] == i) & (shot_data['atNetPlayer1'] == 1)):
        point_df.loc[point_df['pointNumber'] == i, 'atNetPlayer1'] = 1
    # atNetPlayer2
    if any((shot_data['pointNumber'] == i) & (shot_data['atNetPlayer2'] == 1)):
        point_df.loc[point_df['pointNumber'] == i, 'atNetPlayer2'] = 1

# Add atNetPlayer Columns
point_df['atNetPlayer1'] = point_df['atNetPlayer1'].replace({0: "", 1: player1_name})
point_df['atNetPlayer2'] = point_df['atNetPlayer2'].replace({0: "", 1: player2_name})

#### setNum Column

In [111]:
point_df['setNum'] = point_df['setScore'].apply(lambda x: sum(int(char) for char in x if char.isdigit()) + 1)

#### Depths Count (Short, Deep) Columns
- NEED TO DO: Group by pointNumber and sum the counts of Deep, Short, and Long --> put into columns deepCount, and shortCount

In [112]:
# Add counts for each player on how many short and deep balls they hit in the point (group by pointNumber)
# - don't have to do long since we already know what points end with isErrorLong (and also isErrorNet)


shot_data['deep'] = np.where(shot_data['depth'] == 'Deep', 1, 0)
shot_data['short'] = np.where(shot_data['depth'] == 'Short', 1, 0)


#point_df['deepCountPlayer1'] = shot_data.groupby(['pointNumber', 'player1Name'])['deep'].sum().values
#point_df['shortCountPlayer1'] = shot_data.groupby(['pointNumber', 'player1Name'])['short'].sum().values

#point_df['deepCountPlayer2'] = shot_data.groupby(['pointNumber', 'player2Name'])['deep'].sum().values
#point_df['shortCountPlayer2'] = shot_data.groupby(['pointNumber', 'player2Name'])['short'].sum().values



### attempt 2:

deep_group = shot_data.pivot_table(index='pointNumber', columns='shotHitBy', values='deep', aggfunc='sum')
deep_group.columns = ['deepCountPlayer1', 'deepCountPlayer2']
# change NaN's from double faults to 0?
deep_group.fillna(0, inplace=True)


short_group = shot_data.pivot_table(index='pointNumber', columns='shotHitBy', values='short', aggfunc='sum')
short_group.columns = ['shortCountPlayer1', 'shortCountPlayer2']
# change NaN's from double faults to 0?
short_group.fillna(0, inplace=True)


point_df = pd.merge(point_df, deep_group, how='left', on='pointNumber')
point_df = pd.merge(point_df, short_group, how='left', on='pointNumber')

### Add Column: Game Number, Set Number, Game/Set/Point for each player
- **NEED TO DO: Fix pointscore: player1PointScore and player2PointScore change based on side; eg. suppose score is "15-0" player2 serving, currently it just takes 15 and assigns to player1PointScore**

In [113]:
point_df[['player1SetScore', 'player2SetScore']] = point_df['setScore'].str.split('-', expand=True)
point_df[['player1GameScore', 'player2GameScore']] = point_df['gameScore'].str.split('-', expand=True)
point_df[['player1PointScore', 'player2PointScore']] = point_df['pointScore'].str.split('-', expand=True) # NEED TO FIX
if not point_df['tiebreakScore'].isnull().all() and not point_df['tiebreakScore'].eq("").all():
    # Perform the operation only when tiebreakScore is not empty
    point_df[['player1TiebreakScore', 'player2TiebreakScore']] = point_df['tiebreakScore'].str.split('-', expand=True)
else:
    # Set player1TiebreakScore and player2TiebreakScore to NaN
    point_df['player1TiebreakScore'] = np.nan
    point_df['player2TiebreakScore'] = np.nan
    
def calculate_game_number(score):
    return int(score.split('-')[0]) + int(score.split('-')[1]) + 1

# Apply the function to create the 'gameNumber' column
point_df['gameNumber'] = point_df['gameScore'].apply(calculate_game_number)

#### player1ServeResult Column

In [114]:
# Add the 'player1ServeResult' column
point_df['player1ServeResult'] = None

point_df.loc[point_df['serverName'] == point_df['player1Name'], 'player1ServeResult'] = point_df['serveResult']
point_df.loc[point_df['isAce'] == True, 'player1ServeResult'] = 'Ace'

#### player1ServePlacement Column

In [115]:
# Add the 'player1ServePlacement' column
point_df['player1ServePlacement'] = None
point_df.loc[point_df['serverName'] == point_df['player1Name'], 'player1ServePlacement'] = point_df['side'] + ': ' + point_df['serveInPlacement']

#### player1ReturnPlacement Column

In [116]:
# Add the 'player1ReturnPlacement' column
point_df['player1ReturnPlacement'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['returnerName'] == point_df['player1Name'], 'player1ReturnPlacement'] = point_df['returnDirection']

#### player1ReturnFhBh Column

In [117]:
# Add the 'player1ReturnFhBh' column
point_df['player1ReturnFhBh'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['returnerName'] == point_df['player1Name'], 'player1ReturnFhBh'] = point_df['returnFhBh']

#### player1LastShotPlacement Column

In [118]:
# Add the 'player1LastShotFhBh' column
point_df['player1LastShotPlacement'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['lastShotHitBy'] == point_df['player1Name'], 'player1LastShotPlacement'] = point_df['lastShotDirection']

#### player1LastShotFhBh Column

In [119]:
# Add the 'player1LastShotFhBh' column
point_df['player1LastShotFhBh'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['lastShotHitBy'] == point_df['player1Name'], 'player1LastShotFhBh'] = point_df['lastShotFhBh']

#### player1LastShotResult Column

In [120]:
# Add the 'player1LastShotResult' column
point_df['player1LastShotResult'] = None

# Set player1LastShotResult based on conditions, excluding 'Ace' and 'Double Fault'
point_df.loc[
    (point_df['lastShotHitBy'] == point_df['player1Name']) & 
    ~point_df['player1ServeResult'].isin(['Ace', 'Double Fault']), 
    'player1LastShotResult'
] = point_df['lastShotResult']


#### player2ServeResult Column

In [121]:
# Add the 'player2ServeResult' column
point_df['player2ServeResult'] = None

# Set player1ServeResult based on conditions
point_df.loc[point_df['serverName'] == point_df['player2Name'], 'player2ServeResult'] = point_df['serveResult']
point_df.loc[point_df['isAce'] == True, 'player1ServeResult'] = 'Ace'

#### player2ServePlacement Column

In [122]:
# Add the 'player1ServePlacement' column
point_df['player2ServePlacement'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['serverName'] == point_df['player2Name'], 'player2ServePlacement'] = point_df['side'] + ': ' + point_df['serveInPlacement']

#### player2ReturnPlacement Column

In [123]:
# Add the 'player2ReturnPlacement' column
point_df['player2ReturnPlacement'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['returnerName'] == point_df['player2Name'], 'player2ReturnPlacement'] = point_df['returnDirection']

#### player2ReturnFhBh Column

In [124]:
# Add the 'player1ReturnFhBh' column
point_df['player2ReturnFhBh'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['returnerName'] == point_df['player2Name'], 'player2ReturnFhBh'] = point_df['returnFhBh']

#### player2LastShotPlacement Column

In [125]:
# Add the 'player1LastShotFhBh' column
point_df['player2LastShotPlacement'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['lastShotHitBy'] == point_df['player2Name'], 'player2LastShotPlacement'] = point_df['lastShotDirection']

#### player2LastShotFhBh Column

In [126]:
# Add the 'player1LastShotFhBh' column
point_df['player2LastShotFhBh'] = None

# Set player1ServePlacement based on conditions
point_df.loc[point_df['lastShotHitBy'] == point_df['player2Name'], 'player2LastShotFhBh'] = point_df['lastShotFhBh']

#### player2LastShotResult Column

In [127]:
# Add the 'player2LastShotResult' column
point_df['player2LastShotResult'] = None

# Set player1LastShotResult based on conditions, excluding 'Ace' and 'Double Fault'
point_df.loc[
    (point_df['lastShotHitBy'] == point_df['player2Name']) &
    ~point_df['player2ServeResult'].isin(['Ace', 'Double Fault']),
    'player2LastShotResult'
] = point_df['lastShotResult']


#### Name Column

In [128]:
# Change pointScore to the specified format
point_df['Name'] = point_df.apply(lambda row: f"Set {row['setNum']}: {row['gameScore']}, {row['tiebreakScore']} {row['serverName']} Serving" if pd.notna(row['tiebreakScore']) else f"Set {row['setNum']}: {row['gameScore']}, {row['pointScore']} {row['serverName']} Serving", axis=1)

#### Reorder DataFrame for Output
- **NEED TO DO: Update this with new columns from Leo (firstServeLocation and isLet)**

In [129]:
desired_order = ['pointNumber', 'player1Name', 'player2Name', 'pointScore', 'gameScore',
       'setScore', 'tiebreakScore', 'side', 'serverName', 'returnerName',
       'clientTeam', 'opponentTeam', 'Position', 'pointEndPosition',
       'Duration', 'rallyCount', 'rallyCountFreq', 'firstServeIn',
       'secondServeIn', 'serveResult', 'serveInPlacement', 'firstServeZone',
       'secondServeZone', 'isAce', 'serverFarNear', 'serverStartLocation', 'returnerStartLocation', 
       'firstServeXCoord','firstServeYCoord', 'secondServeXCoord', 'secondServeYCoord',
       'returnDirection', 'returnFhBh', 'errorType', 'returnError',
       'lastShotDirection', 'lastShotFhBh', 'lastShotHitBy', 'lastShotResult',
       'pointWonBy', 'isExcitingPoint', 'isBreakPoint', 'atNetPlayer1',
       'atNetPlayer2', 'setNum', 'player1SetScore', 'player2SetScore',
       'player1GameScore', 'player2GameScore', 'player1PointScore',
       'player2PointScore', 'player1TiebreakScore', 'player2TiebreakScore',
       'gameNumber', 'player1ServeResult', 'player1ServePlacement',
       'player1ReturnPlacement', 'player1ReturnFhBh',
       'player1LastShotPlacement', 'player1LastShotFhBh',
       'player1LastShotResult', 'player2ServeResult', 'player2ServePlacement',
       'player2ReturnPlacement', 'player2ReturnFhBh',
       'player2LastShotPlacement', 'player2LastShotFhBh',
       'player2LastShotResult', 'deepCountPlayer1', 'deepCountPlayer2',
       'shortCountPlayer1', 'shortCountPlayer2', 'Name']

# Reorder the columns
point_df = point_df.reindex(columns=desired_order)

In [130]:
point_df_copy = point_df.copy()

# **Point CSV Error Checks**
#### Check Game Number is consecutive increasing. Ex: 1,2,3,4,5,6. End of Set 1. 1,2,3,4,5,6,7,8

In [131]:
game_numbers = point_df['gameNumber'].tolist()

# Initialize variables
seen = set()
prev = None

# Iterate through gameNumber column
for num in game_numbers:
    # If the number is not in seen or it's different from the previous one, print it
    if num not in seen or num != prev:
        print(num, end=', ')
    # If the number is the same as the previous one but not consecutive, print it
    elif num == prev and num not in seen:
        print(num, end=', ')
    # Update seen set and prev variable
    seen.add(num)
    prev = num

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 

#### Check if the columns and their order are the same

In [132]:
print(point_df.shape)
print(point_df_copy.shape)

if (point_df.shape == point_df_copy.shape):
    print('Check passed.')
else:
    raise ValueError('Error: Not the same!')

(112, 73)
(112, 73)
Check passed.


In [133]:
# Get the set of column names for each DataFrame
point_df_columns = set(point_df.columns)
point_df_copy_columns = set(point_df_copy.columns)

# Find the column names unique to each DataFrame
unique_to_point_df = point_df_columns - point_df_copy_columns
unique_to_point_df_copy = point_df_copy_columns - point_df_columns

# Output the results
if unique_to_point_df:
    print("Columns unique to point_df:", unique_to_point_df)
else:
    print("All columns in point_df are also in point_df_copy")

if unique_to_point_df_copy:
    print("Columns unique to point_df_copy:", unique_to_point_df_copy)
else:
    print("All columns in point_df_copy are also in point_df")


All columns in point_df are also in point_df_copy
All columns in point_df_copy are also in point_df


## group by game number and output unique values of serverName and check if > 1

In [157]:
game_number = point_df.groupby('gameNumber')
print(point_df_columns)
server_series = game_number.serverName.unique()
print(server_series)
if(len(server_series) > 1):
   raise ValueError('Error: more than one unique server name')


{'isBreakPoint', 'firstServeIn', 'atNetPlayer2', 'tiebreakScore', 'clientTeam', 'Duration', 'player2ReturnFhBh', 'secondServeXCoord', 'player2ServePlacement', 'pointScore', 'rallyCountFreq', 'setNum', 'player1LastShotResult', 'returnerStartLocation', 'lastShotDirection', 'player1GameScore', 'player2LastShotFhBh', 'firstServeZone', 'returnFhBh', 'lastShotResult', 'deepCountPlayer2', 'secondServeZone', 'serveResult', 'lastShotFhBh', 'serveInPlacement', 'player2Name', 'player2TiebreakScore', 'player2ServeResult', 'firstServeXCoord', 'firstServeYCoord', 'isExcitingPoint', 'serverName', 'returnError', 'player1SetScore', 'player1ServeResult', 'serverStartLocation', 'player2ReturnPlacement', 'player2LastShotResult', 'lastShotHitBy', 'player1TiebreakScore', 'shortCountPlayer2', 'returnerName', 'atNetPlayer1', 'pointNumber', 'player2LastShotPlacement', 'player1LastShotFhBh', 'player2SetScore', 'gameScore', 'pointWonBy', 'side', 'Position', 'errorType', 'player1LastShotPlacement', 'player1Name',

ValueError: Error: more than one unique server name

#### Change all empty cells to ""

In [134]:
point_df.replace([pd.NA, None, pd.NaT, float('nan')], "", inplace=True)

# **Output PointCSV**

In [135]:
player1NameNoSpace = point_df.iloc[0]['player1Name'].replace(" ", "")
player2NameNoSpace = point_df.iloc[0]['player2Name'].replace(" ", "")

# Save DataFrame to CSV file with modified player names
point_df.to_csv(f'Point_Visuals_{player1NameNoSpace}_{player2NameNoSpace}.csv', index=False)

# Function to change csv to json
def csv_to_json(csv_file_path, json_file_name):
    df = pd.read_csv(csv_file_path)
    json_data = df.to_json(orient='records')
    json_file_path = f'{json_file_name}'
    with open(json_file_path, 'w') as json_file:
        json_file.write(json_data)
    
    return json_file_path

# Convert CSV to JSON and save in the same directory
csv_file_path = f"Point_Visuals_{player1NameNoSpace}_{player2NameNoSpace}.csv"
json_file_name = f"Point_Visuals_{player1NameNoSpace}_{player2NameNoSpace}.json"
csv_to_json(csv_file_path, json_file_name)

'Point_Visuals_KaylanBigun_MiguelPerezPena.json'

# **EDA**

### Shot CSV EDA

In [136]:
first_player1Name = point_df['player1Name'].iloc[0]
first_player2Name = point_df['player2Name'].iloc[0]

### Summary Stats

In [137]:
shot_eda = shot_data.copy()
point_df_eda = point_df.copy()

# Can input CSV Directly here for statistics functions
## [Depth EDA]
# Filter shots for Player1:
print(f"\n\nShot Results for {first_player1Name} for match overall:")

player1_shots = shot_eda[(shot_eda['shotHitBy'] == player1Name) & (shot_eda['lastShotError'] != 1) & (shot_eda['shotInRally'] != 1)]
#player1_shots = player1_shots[player1_shots['shotInRally'] != 1]
#print(player1_shots.head(10))
num_player1_shots = len(player1_shots)
num_deep_player1_shots = player1_shots['deep'].sum()
num_short_player1_shots = player1_shots['short'].sum()

print(f"Number of Deep Shots (count): {num_deep_player1_shots}" )
print(f"Number of Short Shots (count): {num_short_player1_shots}" )

# average of ALL shots in match for Player1 (including errors/long shots but excluding serves)
print(f"Number of Deep Shots (%): {num_deep_player1_shots / num_player1_shots:.2f}%")
print(f"Number of Short Shots (%): {num_short_player1_shots / num_player1_shots:.2f}%")


# separate into forehand, backhand, slice, and volley (all sepearate from each other):
# [remove slice and volleys from fh and bh count] ??

fh_shots = player1_shots[(player1_shots['shotFhBh'] == 'Forehand')] #& (player1_shots['isSlice'] != 1) & (player1_shots['isVolley'] != 1)]
bh_shots = player1_shots[(player1_shots['shotFhBh'] == 'Backhand')] #& (player1_shots['isSlice'] != 1) & (player1_shots['isVolley'] != 1)]
slice_shots = player1_shots[player1_shots['isSlice'] == 1]
volley_shots = player1_shots[player1_shots['isVolley'] == 1]

fh_slice = player1_shots[(player1_shots['shotFhBh'] == 'Forehand') & (player1_shots['isSlice'] == 1)]
bh_slice = player1_shots[(player1_shots['shotFhBh'] == 'Backhand') & (player1_shots['isSlice'] == 1)]
fh_volley = player1_shots[(player1_shots['shotFhBh'] == 'Forehand') & (player1_shots['isVolley'] == 1)]
bh_volley = player1_shots[(player1_shots['shotFhBh'] == 'Backhand') & (player1_shots['isVolley'] == 1)]

    
print(f"\nTotal number of Forehands (count): {len(fh_shots)}" )
print(f"Forehands Deep (count): {fh_shots['deep'].sum()}" )
print(f"Forehands Short (count): {fh_shots['short'].sum()}" )
print(f"Forehands Deep (%): {( fh_shots['deep'].sum() / len(fh_shots) ):.2f}%")
print(f"Forehands Short (%): {( fh_shots['short'].sum() / len(fh_shots) ):.2f}%")


print(f"\nTotal number of Backhands (count): {len(bh_shots)}" )
print(f"Backhands Deep (count): {bh_shots['deep'].sum()}" )
print(f"Backhands Short (count): {bh_shots['short'].sum()}" )
print(f"Backhands Deep (%): {( bh_shots['deep'].sum() / len(bh_shots) ):.2f}%")
print(f"Backhands Short (%): {( bh_shots['short'].sum() / len(bh_shots) ):.2f}%")


print(f"\nTotal number of Slices (count): {len(slice_shots)}" )
print(f"Slices Deep (count): {slice_shots['deep'].sum()}" )
print(f"Slices Short (count): {slice_shots['short'].sum()}" )
print(f"Slices Deep (%): {( slice_shots['deep'].sum() / len(slice_shots) ):.2f}%")
print(f"Slices Short (%): {( slice_shots['short'].sum() / len(slice_shots) ):.2f}%")

print(f"Slices Deep Forehand(%): {(fh_slice[('deep')].sum() / len(slice_shots) ):.2f}%")
print(f"Slices Short Forehand (%): {(fh_slice[('short')].sum()/ len(slice_shots) ):.2f}%")
print(f"Slices Deep Backhand (%): {(bh_slice[('deep')].sum()/ len(slice_shots) ):.2f}")
print(f"Slices Short Backhand (%): {(bh_slice[('short')].sum()/ len(slice_shots) ):.2f}")


print(f"\nTotal number of Volleys (count): {len(volley_shots)}" )
print(f"Volleys Deep (count): {volley_shots['deep'].sum()}" )
print(f"Volleys Short (count): {volley_shots['short'].sum()}" )
print(f"Volleys Deep (%): {( volley_shots['deep'].sum() / len(volley_shots) ):.2f}%")
print(f"Volleys Short (%): {( volley_shots['short'].sum() / len(volley_shots) ):.2f}%")

print(f"Volleys Deep Forehand (%): {(fh_volley[('deep')].sum()/ len(volley_shots) ):.2f}")
print(f"Volleys Short Forehand (%): {(fh_volley[('short')].sum()/ len(volley_shots) ):.2f}")
print(f"Volleys Deep Backhand (%): {(bh_volley[('deep')].sum()/ len(volley_shots) ):.2f}")
print(f"Volleys Short Backhand (%): {(bh_volley[('short')].sum()/ len(volley_shots) ):.2f}")
# Filter shots for Player1:
print(f"\n\nShot Results for {first_player1Name} per point:")

print(f"Average Deep Shots (%): {(point_df_eda['deepCountPlayer1'].sum() / len(point_df_eda)):.2f}%")
print(f"Average Short Shots (%): {(point_df_eda['shortCountPlayer1'].sum() / len(point_df_eda)):.2f}%")


# Approach Shots
player1Name = shot_eda.iloc[0]['player1Name']

# Filter shot_data based on the conditions
approach_data_player1 = shot_eda[(shot_eda['isApproach'] == 1) & (shot_eda['shotHitBy'] == player1Name)]

# Count the distinct pointNumbers
distinct_point_numbers = approach_data_player1['pointNumber'].nunique()

# Print the result
print(f"Number of Approach Shots hit by {player1Name}: {distinct_point_numbers}" )

# print(approach_data_player1)



#print(point_df_eda.columns)
#point_df_eda
#player1_shots = shot_eda[shot_eda['shotHitBy'] == player1Name]

# fix above code: make avg in each point, then avg the avg?
# your_file_name = "filename.csv"
# shot_eda = pd.read_csv(your_file_name)



Shot Results for Kaylan Bigun for match overall:
Number of Deep Shots (count): 82
Number of Short Shots (count): 104
Number of Deep Shots (%): 0.44%
Number of Short Shots (%): 0.55%

Total number of Forehands (count): 57
Forehands Deep (count): 28
Forehands Short (count): 28
Forehands Deep (%): 0.49%
Forehands Short (%): 0.49%

Total number of Backhands (count): 123
Backhands Deep (count): 53
Backhands Short (count): 70
Backhands Deep (%): 0.43%
Backhands Short (%): 0.57%

Total number of Slices (count): 6
Slices Deep (count): 1
Slices Short (count): 5
Slices Deep (%): 0.17%
Slices Short (%): 0.83%
Slices Deep Forehand(%): 0.17%
Slices Short Forehand (%): 0.67%
Slices Deep Backhand (%): 0.00
Slices Short Backhand (%): 0.17

Total number of Volleys (count): 8
Volleys Deep (count): 1
Volleys Short (count): 6
Volleys Deep (%): 0.12%
Volleys Short (%): 0.75%
Volleys Deep Forehand (%): 0.00
Volleys Short Forehand (%): 0.00
Volleys Deep Backhand (%): 0.00
Volleys Short Backhand (%): 0.00



### Point CSV EDA

#### Serve and Return Stats

In [138]:
first_player1Name = point_df_eda['player1Name'].iloc[0]



# Display the results
print(f"\nServe Results for {first_player1Name}:")

# Assuming point_df_eda is your DataFrame
total_serves = len(point_df_eda[point_df_eda['serverName'] == first_player1Name])
first_serve_in_count = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['firstServeIn'] == 1)])
first_serve_won_count = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['firstServeIn'] == 1) & (point_df_eda['pointWonBy'] == first_player1Name)])
percentage_first_serve_in = (first_serve_in_count / total_serves) * 100 if total_serves > 0 else 0
percentage_first_serve_won = (first_serve_won_count / first_serve_in_count) * 100 if first_serve_in_count > 0 else 0

second_serve_total_count = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['firstServeIn'] == 0)])
second_serve_in_count = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['firstServeIn'] == 0)& (point_df_eda['secondServeIn'] == 1)])
second_serve_won_count = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['firstServeIn'] == 0)& (point_df_eda['secondServeIn'] == 1) & (point_df_eda['pointWonBy'] == first_player1Name)])
percentage_second_serve_in = (second_serve_in_count / second_serve_total_count) * 100 if second_serve_total_count > 0 else 0
percentage_second_serve_won = (second_serve_won_count / second_serve_in_count) * 100 if second_serve_in_count > 0 else 0



# Display the results
print("\nTotal Serves:", total_serves)
print("First Serve In (Count):", first_serve_in_count)
print("First Serve Won (Count):", first_serve_won_count)
print(f"First Serve In (%): {percentage_first_serve_in:.2f}%")
print(f"First Serve Won (%): {percentage_first_serve_won:.2f}%")

print("Second Serve In (Count):", second_serve_in_count)
print("Second Serve Total (Count):", second_serve_total_count)
print("Second Serve Won (Count):", second_serve_won_count)
print(f"Second Serve In (%): {percentage_second_serve_in:.2f}%")
print(f"Second Serve Won (%): {percentage_second_serve_won:.2f}%")

# Assuming point_df is your DataFrame
count_is_ace = (point_df_eda[point_df_eda['serverName'] == first_player1Name]['isAce']).sum()
count_is_double_fault = ((point_df_eda['serverName'] == first_player1Name) & (point_df_eda['serveResult'] == "Double Fault")).sum()

# Display the results
print("Ace (Count):", count_is_ace)
print("Double Fault (Count):", count_is_double_fault)

# Count of rows where serverName is equal to the first row of player1Name and pointWonBy is equal to the first row of player1Name
total_service_points_won = len(point_df_eda[(point_df_eda['serverName'] == first_player1Name) & (point_df_eda['pointWonBy'] == first_player1Name)])
total_service_points_won_percentage = total_service_points_won / total_serves *100

# Display the results
print(f"Points Won on Serve (Count) {total_service_points_won}")

print(f"Points Won on Serve (%): {total_service_points_won_percentage:.2f}%")

# Assuming point_df is your DataFrame
return_points = point_df_eda[(point_df_eda['returnerName'] == first_player1Name)] # CHANGED THIS JERRY, REMOVED RALLY COUNT >= 2

total_return = len(return_points)
returnMade = len(return_points[(return_points['rallyCount'] > 2) | ((return_points['rallyCount'] == 2) & (return_points['lastShotResult'] != 'Error'))])
returnError = len(return_points[(return_points['lastShotResult'] == 'Error') & (return_points['rallyCount'] == 2)])
returnWinner = len(return_points[(return_points['lastShotResult'] == 'Winner') & (return_points['rallyCount'] == 2)])
returnMadePercentage = returnMade/total_return

returnWonByPlayer1 = len(return_points[return_points['pointWonBy'] == first_player1Name])
returnWonByPlayer1Percentage = returnWonByPlayer1 / returnMade * 100 if returnMade > 0 else 0

deuceReturnCount = len(return_points[return_points['side'] == 'Deuce'])
adReturnCount = len(return_points[return_points['side'] == 'Ad'])


deuceReturnMade = len(return_points[(return_points['side'] == 'Deuce') & ((return_points['rallyCount'] > 2) | ((return_points['rallyCount'] == 2) & (return_points['lastShotResult'] != 'Error')))])
adReturnMade = len(return_points[(return_points['side'] == 'Ad') & ((return_points['rallyCount'] > 2) | ((return_points['rallyCount'] == 2) & (return_points['lastShotResult'] != 'Error')))])

deuceReturnMadePercentage = deuceReturnMade/deuceReturnCount
adReturnMadePercentage = adReturnMade/adReturnCount

deuceReturnWonByPlayer1 = len(return_points[(return_points['side'] == 'Deuce') & (return_points['pointWonBy'] == first_player1Name) | ((return_points['rallyCount'] == 2) & (return_points['lastShotResult'] != 'Error'))])
adReturnWonByPlayer1 = len(return_points[(return_points['side'] == 'Ad') & (return_points['pointWonBy'] == first_player1Name) | ((return_points['rallyCount'] == 2) & (return_points['lastShotResult'] != 'Error'))])

deuceReturnWonByPlayer1Percentage = deuceReturnWonByPlayer1 / deuceReturnMade * 100 if deuceReturnMade > 0 else 0
adReturnWonByPlayer1Percentage = adReturnWonByPlayer1 / adReturnMade * 100 if adReturnMade > 0 else 0




print(f"\nReturn Results for {first_player1Name}:\n")

print("Total Return (Count):", total_return)
print("Return Won (Count):", returnWonByPlayer1)
print("Return Won (%):", returnWonByPlayer1Percentage)

print("\nReturn Made (Count):", returnMade)
print("Return Made (%):", returnMadePercentage)
print("Return Error (Count):", returnError)
print("Return Winner (Count):", returnWinner)

print("\nDeuce Return (Count):", deuceReturnCount)
print("Deuce Return Made (Count):", deuceReturnMade)
print("Deuce Return Made (%):", deuceReturnMadePercentage)
print("Deuce Return Won by Player1 (%):", deuceReturnWonByPlayer1Percentage)
print("Deuce Return Won by Player1 (Count):", deuceReturnWonByPlayer1)


print("\nAd Return (Count):", adReturnCount)
print("Ad Return Made (Count):", adReturnMade)
print("Ad Return Made (%):", adReturnMadePercentage)
print("Ad Return Won by Player1 (Count):", adReturnWonByPlayer1)
print("Ad Return Won by Player1 (%):", adReturnWonByPlayer1Percentage)

# Assuming return_points is your DataFrame
deuce_return_points = return_points[(return_points['side'] == 'Deuce') & (return_points['returnerName'] == first_player1Name) & (return_points['rallyCount'] >= 2)]

# Deuce Return Points Separated by returnFhBh
deuce_forehand_return_points = deuce_return_points[deuce_return_points['returnFhBh'] == 'Forehand']
deuce_backhand_return_points = deuce_return_points[deuce_return_points['returnFhBh'] == 'Backhand']


# Count for Deuce Return Points - Made
count_deuce_forehand_made = len(deuce_forehand_return_points[(deuce_forehand_return_points['rallyCount'] > 2) | ((deuce_forehand_return_points['rallyCount'] == 2) & (deuce_forehand_return_points['lastShotResult'] != 'Error'))])
count_deuce_backhand_made = len(deuce_backhand_return_points[(deuce_backhand_return_points['rallyCount'] > 2) | ((deuce_backhand_return_points['rallyCount'] == 2) & (deuce_backhand_return_points['lastShotResult'] != 'Error'))])

# Count for Deuce Return Points - Error
count_deuce_forehand_error = len(deuce_forehand_return_points[(deuce_forehand_return_points['lastShotResult'] == 'Error') & (deuce_forehand_return_points['rallyCount'] == 2)])
count_deuce_backhand_error = len(deuce_backhand_return_points[(deuce_backhand_return_points['lastShotResult'] == 'Error') & (deuce_backhand_return_points['rallyCount'] == 2)])

# Display the counts
print("\nDeuce Forehand Return Points - Made:", count_deuce_forehand_made)
print("Deuce Forehand Return Points - Error:", count_deuce_forehand_error)

print("Deuce Backhand Return Points - Made:", count_deuce_backhand_made)
print("Deuce Backhand Return Points - Error:", count_deuce_backhand_error)

# Assuming return_points is your DataFrame
ad_return_points = return_points[(return_points['side'] == 'Ad') & (return_points['returnerName'] == first_player1Name) & (return_points['rallyCount'] >= 2)]

# Ad Return Points Separated by returnFhBh
ad_forehand_return_points = ad_return_points[ad_return_points['returnFhBh'] == 'Forehand']
ad_backhand_return_points = ad_return_points[ad_return_points['returnFhBh'] == 'Backhand']

# Count for Ad Return Points - Made
count_ad_forehand_made = len(ad_forehand_return_points[(ad_forehand_return_points['rallyCount'] > 2) | ((ad_forehand_return_points['rallyCount'] == 2) & (ad_forehand_return_points['lastShotResult'] != 'Error'))])
count_ad_backhand_made = len(ad_backhand_return_points[(ad_backhand_return_points['rallyCount'] > 2) | ((ad_backhand_return_points['rallyCount'] == 2) & (ad_backhand_return_points['lastShotResult'] != 'Error'))])

# Count for Ad Return Points - Error
count_ad_forehand_error = len(ad_forehand_return_points[(ad_forehand_return_points['lastShotResult'] == 'Error') & (ad_forehand_return_points['rallyCount'] == 2)])
count_ad_backhand_error = len(ad_backhand_return_points[(ad_backhand_return_points['lastShotResult'] == 'Error') & (ad_backhand_return_points['rallyCount'] == 2)])

# Display the counts
print("\nAd Forehand Return Points - Made:", count_ad_forehand_made)
print("Ad Forehand Return Points - Error:", count_ad_forehand_error)

print("Ad Backhand Return Points - Made:", count_ad_backhand_made)
print("Ad Backhand Return Points - Error:", count_ad_backhand_error)

print(f"\nAt Net Results for {first_player1Name}:\n")


# Total points where atNetPlayer1 = first_player1Name
total_at_net_player1 = len(point_df_eda[point_df_eda['atNetPlayer1'] == first_player1Name])

# Percentage of points where atNetPlayer1 = 1 out of total points
percentage_at_net_player1 = (total_at_net_player1 / len(point_df_eda)) * 100 if len(point_df_eda) > 0 else 0

# Display the total count and percentage of points where atNetPlayer1 = 1
print(f"Total Net Points for {first_player1Name}: {total_at_net_player1}")
print(f"Percentage of Net Points for {first_player1Name}: {percentage_at_net_player1:.2f}%")

# Points where atNetPlayer1 = first_player1Name and pointWonBy = first_player1Name
at_net_player1_and_won_by_player1 = len(point_df_eda[(point_df_eda['atNetPlayer1'] == first_player1Name) & (point_df_eda['pointWonBy'] == first_player1Name)])

# Percentage of points where atNetPlayer1 = first_player1Name and pointWonBy = first_player1Name out of total points where atNetPlayer1 = 1
percentage_at_net_player1_and_won_by_player1 = (at_net_player1_and_won_by_player1 / total_at_net_player1) * 100 if total_at_net_player1 > 0 else 0

# Display the count and percentage of points where atNetPlayer1 = 1 and pointWonBy = first_player1Name
print(f"\nTotal Net Points won by {first_player1Name}: {at_net_player1_and_won_by_player1}")
print(f"Percentage of Net Points won by {first_player1Name}: {percentage_at_net_player1_and_won_by_player1:.2f}%")



Serve Results for Kaylan Bigun:

Total Serves: 48
First Serve In (Count): 32
First Serve Won (Count): 22
First Serve In (%): 66.67%
First Serve Won (%): 68.75%
Second Serve In (Count): 12
Second Serve Total (Count): 16
Second Serve Won (Count): 4
Second Serve In (%): 75.00%
Second Serve Won (%): 33.33%
Ace (Count): 3
Double Fault (Count): 4
Points Won on Serve (Count) 27
Points Won on Serve (%): 56.25%

Return Results for Kaylan Bigun:

Total Return (Count): 64
Return Won (Count): 23
Return Won (%): 50.0

Return Made (Count): 46
Return Made (%): 0.71875
Return Error (Count): 17
Return Winner (Count): 3

Deuce Return (Count): 33
Deuce Return Made (Count): 24
Deuce Return Made (%): 0.7272727272727273
Deuce Return Won by Player1 (%): 45.83333333333333
Deuce Return Won by Player1 (Count): 11

Ad Return (Count): 31
Ad Return Made (Count): 22
Ad Return Made (%): 0.7096774193548387
Ad Return Won by Player1 (Count): 16
Ad Return Won by Player1 (%): 72.72727272727273

Deuce Forehand Return Poi

### Breakpoint Stats

In [139]:
# Caitlin Breakpoint Data

# points_returned = point_df_eda[point_df_eda[]]

# print(f"\nBreakpoint Results for {first_player1Name}:\n")

# # Total points where isBreakPoint = 1
# total_breakpoint = len(point_df_eda[point_df_eda['isBreakPoint'] == 1])

# # Points where isBreakPoint = 1 and pointWonBy = first_player1Name
# breakpoint_and_won_by_player1 = len(point_df_eda[(point_df_eda['isBreakPoint'] == 1) & (point_df_eda['pointWonBy'] == first_player1Name)])

# # Percentage of points where isBreakPoint = 1 and pointWonBy = first_player1Name out of total points where isBreakPoint = 1
# percentage_breakpoint_and_won_by_player1 = (breakpoint_and_won_by_player1 / total_breakpoint) * 100 if total_breakpoint > 0 else 0

# # Display the total count of points where isBreakPoint = 1
# print(f"Total Breakpoints: {total_breakpoint}")

# # Display the count and percentage of points where isBreakPoint = 1 and pointWonBy = first_player1Name
# print(f"Total Breakpoints won by {first_player1Name}: {breakpoint_and_won_by_player1}")
# print(f"Percentage of Breakpoints won by {first_player1Name}: {percentage_breakpoint_and_won_by_player1:.2f}%")

# # Total points where isBreakPoint = 1 and serverName = first_player1Name
# total_breakpoint_serve = len(point_df_eda[(point_df_eda['isBreakPoint'] == 1) & (point_df_eda['serverName'] == first_player1Name)])

# # Points where isBreakPoint = 1, serverName = first_player1Name, and pointWonBy = first_player1Name
# breakpoint_and_won_by_player1_serve = len(point_df_eda[(point_df_eda['isBreakPoint'] == 1) & (point_df_eda['serverName'] == first_player1Name) & (point_df_eda['pointWonBy'] == first_player1Name)])

# # Percentage of points where isBreakPoint = 1, serverName = first_player1Name, and pointWonBy = first_player1Name out of total points where isBreakPoint = 1 and serverName = first_player1Name
# percentage_breakpoint_and_won_by_player1_serve = (breakpoint_and_won_by_player1_serve / total_breakpoint_serve) * 100 if total_breakpoint_serve > 0 else 0

# # Display the total count of points where isBreakPoint = 1 and serverName = first_player1Name
# print(f"\nTotal Breakpoints on Serve for {first_player1Name}: {total_breakpoint_serve}")

# # Display the count and percentage of points where isBreakPoint = 1, serverName = first_player1Name, and pointWonBy = first_player1Name
# print(f"Total Breakpoints won on Serve by {first_player1Name}: {breakpoint_and_won_by_player1_serve}")
# print(f"Percentage of Breakpoints won on Serve by {first_player1Name}: {percentage_breakpoint_and_won_by_player1_serve:.2f}%")

# Total points where isBreakPoint = 1 and returnerName = first_player1Name
total_breakpoint_return = len(point_df_eda[(point_df_eda['isBreakPoint'] == 1) & (point_df_eda['returnerName'] == first_player1Name)])

# Points where isBreakPoint = 1, returnerName = first_player1Name, and pointWonBy = first_player1Name
breakpoint_and_won_by_player1_return = len(point_df_eda[(point_df_eda['isBreakPoint'] == 1) & (point_df_eda['returnerName'] == first_player1Name) & (point_df_eda['pointWonBy'] == first_player1Name)])

# Percentage of points where isBreakPoint = 1, returnerName = first_player1Name, and pointWonBy = first_player1Name out of total points where isBreakPoint = 1 and returnerName = first_player1Name
percentage_breakpoint_and_won_by_player1_return = (breakpoint_and_won_by_player1_return / total_breakpoint_return) * 100 if total_breakpoint_return > 0 else 0

# Display the total count of points where isBreakPoint = 1 and returnerName = first_player1Name
print(f"\nTotal Breakpoints on Return for {first_player1Name}: {total_breakpoint_return}")

# Display the count and percentage of points where isBreakPoint = 1, returnerName = first_player1Name, and pointWonBy = first_player1Name
print(f"Total Breakpoints won on Return by {first_player1Name}: {breakpoint_and_won_by_player1_return}")
print(f"Percentage of Breakpoints won on Return by {first_player1Name}: {percentage_breakpoint_and_won_by_player1_return:.2f}%")

# Jimmy Returning Games Won
# games won/returning games by Jimmy Hou

points_returned = point_df_eda[point_df_eda["returnerName"] == first_player1Name]

# # Return percentage won on first serve
# first_serves_won = points_returned[(points_returned['firstServeIn'] == 1) & (points_returned['pointWonBy'] == first_player1Name)]
# total_first_serves = points_returned[points_returned['firstServeIn'] == 1]
# fs_won_per_player1 = 100 * len(first_serves_won) / len(total_first_serves)
# # print(f"\n{player1} won {fs_won_per_player1:.2f}% of first serves returned.")
# print(f"\nPerecentage of Breakpoinnts won by {first_player1Name} on Return when returning a first serve: {fs_won_per_player1:.2f}%")

# # Return percentage won on second serve
# second_serves_won = points_returned[(points_returned['secondServeIn'] == 1) & (points_returned['pointWonBy'] == first_player1Name)]
# total_second_serves = points_returned[points_returned['secondServeIn'] == 1]
# ss_won_per_player1 = 100 * len(second_serves_won) / len(total_second_serves)
# # print(f"{player1} won {ss_won_per_player1:.2f}% of second serves returned."
# print(f"Perecentage of Breakpoinnts won by {first_player1Name} on Return when returning a second serve: {ss_won_per_player1:.2f}%")



Total Breakpoints on Return for Kaylan Bigun: 4
Total Breakpoints won on Return by Kaylan Bigun: 2
Percentage of Breakpoints won on Return by Kaylan Bigun: 50.00%


In [140]:
point_df_eda.columns

Index(['pointNumber', 'player1Name', 'player2Name', 'pointScore', 'gameScore',
       'setScore', 'tiebreakScore', 'side', 'serverName', 'returnerName',
       'clientTeam', 'opponentTeam', 'Position', 'pointEndPosition',
       'Duration', 'rallyCount', 'rallyCountFreq', 'firstServeIn',
       'secondServeIn', 'serveResult', 'serveInPlacement', 'firstServeZone',
       'secondServeZone', 'isAce', 'serverFarNear', 'serverStartLocation',
       'returnerStartLocation', 'firstServeXCoord', 'firstServeYCoord',
       'secondServeXCoord', 'secondServeYCoord', 'returnDirection',
       'returnFhBh', 'errorType', 'returnError', 'lastShotDirection',
       'lastShotFhBh', 'lastShotHitBy', 'lastShotResult', 'pointWonBy',
       'isExcitingPoint', 'isBreakPoint', 'atNetPlayer1', 'atNetPlayer2',
       'setNum', 'player1SetScore', 'player2SetScore', 'player1GameScore',
       'player2GameScore', 'player1PointScore', 'player2PointScore',
       'player1TiebreakScore', 'player2TiebreakScore', 'gam

### Serve Win Percentage

In [141]:
# Filter points where serverName is equal to first_player1Name
filtered_points = point_df[point_df['serverName'] == first_player1Name]

# Group the filtered points by player1ServePlacement and count the occurrences
serve_placement_counts = filtered_points.groupby('player1ServePlacement').size()

serve_placements_ad = serve_placement_counts.filter(like='Ad').sum()
serve_placements_deuce = serve_placement_counts.filter(like='Deuce').sum()

# Iterate over filtered_points
for index, point in filtered_points.iterrows():
    serve_placement = point['player1ServePlacement']

    # Check if serve placement is not in serve_placement_counts
    if serve_placement not in serve_placement_counts:
        print(point)


# Initialize dictionaries to store counts and percentages
point_won_counts = {}
point_won_percentages = {}
print(f"Total {len(filtered_points)}")

# Iterate over serve placements
for serve_placement, count in serve_placement_counts.items():
    # Filter points with the specific serve placement
    serve_placement_points = filtered_points[filtered_points['player1ServePlacement'] == serve_placement]
    
    # Count points won by first_player1Name
    point_won_count = serve_placement_points[serve_placement_points['pointWonBy'] == first_player1Name].shape[0]

    # Calculate percentage
    point_won_percentage = (point_won_count / count) * 100 if count > 0 else 0

    # Store counts and percentages
    point_won_counts[serve_placement] = point_won_count
    point_won_percentages[serve_placement] = point_won_percentage
    


    
    
# Filter serve placements for Ad and Deuce [CHANGED: BRIAN NTOES]
serve_placements_ad = serve_placement_counts.filter(like='Ad').sum()
serve_placements_deuce = serve_placement_counts.filter(like='Deuce').sum()    

# Print counts and percentages [CHANGED: BRIAN NOTES]
for serve_placement, count in serve_placement_counts.items():
    print(f"Serve Placement: {serve_placement}")
    print(f"Total Serves: {count}")
    
    if "Deuce" in serve_placement: # [CHANGED: BRIAN NOTES]
        deuce_serve_format_number = f"{count/serve_placements_deuce:.2f}"
        deuce_serve_percent = round(float(deuce_serve_format_number) * 100,2)
        print(f"Serve Frequency: {deuce_serve_percent}% ({count}/{serve_placements_deuce}) Deuce Serves")
    if "Ad" in serve_placement: # [CHANGED: BRIAN NOTES]
        ad_serve_format_number = f"{count/serve_placements_ad:.2f}"
        ad_serve_percent = round(float(ad_serve_format_number) * 100,2)
        print(f"Serve Frequency: {ad_serve_percent}% ({count}/{serve_placements_ad}) Ad Serves")
        
    print(f"Serves Won by {first_player1Name}: {point_won_counts.get(serve_placement, 0)}")
    print(f"Percentage: {point_won_percentages.get(serve_placement, 0):.2f}%\n")

    
# print("This is Ad count: " + str(len(point_df[(point_df['serverName'] == first_player1Name) & (point_df['side'] == 'Ad')])))
# print("This is Deuce count: " + str(len(point_df[(point_df['serverName'] == first_player1Name) & (point_df['side'] == 'Deuce')])))

Total 48
Serve Placement: 
Total Serves: 2
Serves Won by Kaylan Bigun: 1
Percentage: 50.00%

Serve Placement: Ad: Body
Total Serves: 1
Serve Frequency: 4.0% (1/24) Ad Serves
Serves Won by Kaylan Bigun: 0
Percentage: 0.00%

Serve Placement: Ad: T
Total Serves: 5
Serve Frequency: 21.0% (5/24) Ad Serves
Serves Won by Kaylan Bigun: 3
Percentage: 60.00%

Serve Placement: Ad: Wide
Total Serves: 18
Serve Frequency: 75.0% (18/24) Ad Serves
Serves Won by Kaylan Bigun: 11
Percentage: 61.11%

Serve Placement: Deuce: Body
Total Serves: 3
Serve Frequency: 14.0% (3/22) Deuce Serves
Serves Won by Kaylan Bigun: 0
Percentage: 0.00%

Serve Placement: Deuce: T
Total Serves: 14
Serve Frequency: 64.0% (14/22) Deuce Serves
Serves Won by Kaylan Bigun: 8
Percentage: 57.14%

Serve Placement: Deuce: Wide
Total Serves: 5
Serve Frequency: 23.0% (5/22) Deuce Serves
Serves Won by Kaylan Bigun: 4
Percentage: 80.00%



### Error Stats

In [142]:
print(f"\nError Data for {first_player1Name}:\n")
# Filter the DataFrame based on specified conditions
total_errors = point_df[(point_df['lastShotHitBy'] == first_player1Name) &
                           (point_df['lastShotResult'] == 'Error')]

import numpy as np

# Filter rows without NaN values in relevant columns
forehand_errors = point_df[(point_df['lastShotHitBy'] == first_player1Name) &
                           (point_df['lastShotResult'] == 'Error') &
                           (point_df['lastShotFhBh'] == 'Forehand') &
                           (~point_df['errorType'].isnull())]  # Ensure 'errorType' column doesn't have NaN
 
backhand_errors = point_df[(point_df['lastShotHitBy'] == first_player1Name) &
                           (point_df['lastShotResult'] == 'Error') &
                           (point_df['lastShotFhBh'] == 'Backhand') &
                           (~point_df['errorType'].isnull())]  # Ensure 'errorType' column doesn't have NaN

# Count the occurrences of 'Forehand' and 'Backhand' separately
forehand_counts = forehand_errors.shape[0]  # Count rows
backhand_counts = backhand_errors.shape[0]  # Count rows

# Print the total error counts for verification
total_error_counts = forehand_counts + backhand_counts



# Desired output order 
desired_order = ['Net', 'Long', 'Wide Right', 'Wide Left'] 

# # Get value counts of 'errorType' for Forehand errors [CHANGED: BRIAN]
# forehand_error_types = forehand_errors['errorType'].value_counts(dropna=False).loc[desired_order]  # Include NaN values in count
# forehand_error_types_df = pd.DataFrame(forehand_error_types) # change into dataframe to erase object line
# Get value counts of 'errorType' for Backhand errors



######################### CHANGED SECTION ##############################################
# [CHANGED: BRIAN]

forehand_error_types = forehand_errors['errorType'].value_counts(dropna=False)

# Create a Series with desired index containing zeros
zeros_series = pd.Series(0, index=desired_order)

# Combine the original Series with the zeros Series
forehand_error_types_combined = forehand_error_types.combine(zeros_series, max, fill_value=0)

# Reindex the Series to follow the desired order
forehand_error_types_ordered = forehand_error_types_combined.reindex(desired_order, fill_value=0)

# Create the DataFrame
forehand_error_types_df = pd.DataFrame(forehand_error_types_ordered, columns=['Count'])

# Get value counts of 'errorType' for Backhand errors [CHANGED: BRIAN]
# backhand_error_types = backhand_errors['errorType'].value_counts(dropna=False).loc[desired_order]  # Include NaN values in count
# backhand_error_types_df = pd.DataFrame(backhand_error_types) # change into dataframe to erase object line

# Get value counts of 'errorType' for Backhand errors
backhand_error_types = backhand_errors['errorType'].value_counts(dropna=False)

# Combine the original Series with the zeros Series
backhand_error_types_combined = backhand_error_types.combine(zeros_series, max, fill_value=0)

# Reindex the Series to follow the desired order
backhand_error_types_ordered = backhand_error_types_combined.reindex(desired_order, fill_value=0)

# Create the DataFrame
backhand_error_types_df = pd.DataFrame(backhand_error_types_ordered, columns=['Count'])



######################### CHANGED SECTION ##############################################


# Print the counts and error types
print("Count of Total errors:", total_error_counts)
print("Count of Forehand errors:", forehand_counts)
print(f"Forehand Error %: {(forehand_counts/total_error_counts)*100:.2f}%")
print("Count of Backhand errors:", backhand_counts)
print(f"Backhand Error %: {(backhand_counts/total_error_counts)*100:.2f}%")
print("\nForehand errors:\n", forehand_error_types_df)


# # Group by both 'lastShotDirection' and 'errorType', and then count occurrences
forehand_error_counts = forehand_errors.groupby(['player1LastShotPlacement', 'errorType']).size().unstack(fill_value=0)  # Fill NaN with 0
forehand_error_counts_ordered = forehand_error_counts.reindex(columns = desired_order) # [CHANGED: BRIAN]

print("\nValue counts of 'errorType' for Forehand errors with different directions:\n", forehand_error_counts_ordered)


print("\nBackhand errors:\n", backhand_error_types_df)


# Group by both 'lastShotDirection' and 'errorType', and then count occurrences
backhand_error_counts = backhand_errors.groupby(['player1LastShotPlacement', 'errorType']).size().unstack(fill_value=0) # Fill NaN with 0
backhand_error_counts_ordered = backhand_error_counts.reindex(columns = desired_order) # [CHANGED: BRIAN]

print("\nValue counts of 'errorType' for Backhand errors with different directions:\n", backhand_error_counts_ordered)


Error Data for Kaylan Bigun:

Count of Total errors: 41
Count of Forehand errors: 15
Forehand Error %: 36.59%
Count of Backhand errors: 26
Backhand Error %: 63.41%

Forehand errors:
             Count
Net            10
Long            4
Wide Right      0
Wide Left       1

Value counts of 'errorType' for Forehand errors with different directions:
 errorType                 Net  Long  Wide Right  Wide Left
player1LastShotPlacement                                  
                            1     0         NaN          0
Crosscourt                  3     3         NaN          0
Down the Line               6     1         NaN          1

Backhand errors:
             Count
Net            19
Long            2
Wide Right      1
Wide Left       4

Value counts of 'errorType' for Backhand errors with different directions:
 errorType                 Net  Long  Wide Right  Wide Left
player1LastShotPlacement                                  
Crosscourt                 12     1           0   