## <center><u>Basketball Team Analysis</u></center>
- **DOMAIN:** Sports
- **CONTEXT:** Company X manages the men's top professional basketball division of the American league system.
The dataset contains information on all the teams that have participated in past tournaments. It records
how many baskets each team scored and conceded, how many times it finished in the top two positions,
how many tournaments it qualified for, its best past position, and so on.
- **DATA DESCRIPTION:** Basketball.csv - The dataset contains information on all the teams that have participated in
past tournaments so far.
- **ATTRIBUTE INFORMATION:**
 1. `Team`: Team’s name
 2. `Tournament`: Number of played tournaments.
 3. `Score`: Team’s score so far.
 4. `PlayedGames`: Games played by the team so far.
 5. `WonGames`: Games won by the team so far.
 6. `DrawnGames`: Games drawn by the team so far.
 7. `LostGames`: Games lost by the team so far.
 8. `BasketScored`: Baskets scored by the team so far.
 9. `BasketGiven`: Baskets scored against the team so far.
 10. `TournamentChampion`: How many times the team has been tournament champion so far.
 11. `Runner-up`: How many times the team has been tournament runner-up so far.
 12. `TeamLaunch`: Year the team was launched in professional basketball.
 13. `HighestPositionHeld`: Highest position held by the team amongst all the tournaments played.
- **PROJECT OBJECTIVE:** Company X's management wants to invest in managing some of the best
teams in the league. The analytics department has been tasked with creating a report on the
performance of the teams. Some of the older teams are already under contract with competitors, so
Company X wants to understand which teams it can approach for a winning deal.
___
### Imports and Configurations

In [1]:
# Import all the libraries needed to load the dataset and visualize it
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
# Configure for any default setting of any library
%matplotlib inline
sns.set(style='darkgrid', palette='deep', font='sans-serif', font_scale=1.3, color_codes=True)

**Comments**
- **``%matplotlib inline``** sets the matplotlib backend to 'inline'. With this backend, the output of plotting commands is displayed directly below the cell, without calling ``plt.show()`` after every plot.
- ``sns.set(...)`` sets a few of Seaborn's aesthetic parameters: style, color palette, font, and font scale.

### Load the Dataset

In [3]:
# Load the dataset into a Pandas dataframe called basketball
basketball = pd.read_csv('DS - Part2 - Basketball.csv')

In [4]:
# Check the head of the dataset
basketball.head()

Unnamed: 0,Team,Tournament,Score,PlayedGames,WonGames,DrawnGames,LostGames,BasketScored,BasketGiven,TournamentChampion,Runner-up,TeamLaunch,HighestPositionHeld
0,Team 1,86,4385,2762,1647,552,563,5947,3140,33,23,1929,1
1,Team 2,86,4262,2762,1581,573,608,5900,3114,25,25,1929,1
2,Team 3,80,3442,2614,1241,598,775,4534,3309,10,8,1929,1
3,Team 4,82,3386,2664,1187,616,861,4398,3469,6,6,1931to32,1
4,Team 5,86,3368,2762,1209,633,920,4631,3700,8,7,1929,1


In [5]:
# Check the tail of the dataset
basketball.tail()

Unnamed: 0,Team,Tournament,Score,PlayedGames,WonGames,DrawnGames,LostGames,BasketScored,BasketGiven,TournamentChampion,Runner-up,TeamLaunch,HighestPositionHeld
56,Team 57,1,34,38,8,10,20,38,66,-,-,2009-10,20
57,Team 58,1,22,30,7,8,15,37,57,-,-,1956-57,16
58,Team 59,1,19,30,7,5,18,51,85,-,-,1951~52,16
59,Team 60,1,14,30,5,4,21,34,65,-,-,1955-56,15
60,Team 61,1,-,-,-,-,-,-,-,-,-,2017~18,9


**Comments**: For a closer look at the data, pandas provides the **`.head()`** method, which returns the first five rows by default, and the **`.tail()`** method, which returns the last five rows.
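Both methods also accept an optional row count. The sketch below illustrates this on a small stand-in frame named `demo` (a hypothetical name, not part of the notebook), built from the `Team` and `Tournament` values shown in the outputs above.

```python
import pandas as pd

# Stand-in frame: Team and Tournament values copied from the head/tail outputs above.
# `demo` is an illustrative name; the notebook itself works on `basketball`.
demo = pd.DataFrame({
    "Team": ["Team 1", "Team 2", "Team 3", "Team 4", "Team 5",
             "Team 57", "Team 58", "Team 59", "Team 60", "Team 61"],
    "Tournament": [86, 86, 80, 82, 86, 1, 1, 1, 1, 1],
})

first_three = demo.head(3)  # first 3 rows instead of the default 5
last_two = demo.tail(2)     # last 2 rows instead of the default 5

print(first_three)
print(last_two)
```

Applied to the notebook's dataframe, the same calls would be `basketball.head(3)` and `basketball.tail(2)`.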

### Inspect the Dataset

In [6]:
# Get the shape (rows, columns) of the dataset
basketball.shape

(61, 13)

In [7]:
# Get more info on it
# 1. Name of the columns
# 2. Find the data type of each column
# 3. Look for any null/missing values
basketball.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 61 entries, 0 to 60
Data columns (total 13 columns):
Team                   61 non-null object
Tournament             61 non-null int64
Score                  61 non-null object
PlayedGames            61 non-null object
WonGames               61 non-null object
DrawnGames             61 non-null object
LostGames              61 non-null object
BasketScored           61 non-null object
BasketGiven            61 non-null object
TournamentChampion     61 non-null object
Runner-up              61 non-null object
TeamLaunch             61 non-null object
HighestPositionHeld    61 non-null int64
dtypes: int64(2), object(11)
memory usage: 6.3+ KB


**Observations**
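- This dataset contains **61** observations with **13** attributes
- Only _Tournament_ and _HighestPositionHeld_ are of integer type (`int64`); the other 11 columns have dtype `object`, meaning they were read as strings
- `info()` reports **no null values**, but the tail of the dataset shows `-` placeholders (for example, Team 61), which most likely stand for missing values and explain why numeric columns such as _Score_ were read as `object`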
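Although `info()` reports no nulls, the tail output above contains `-` placeholders, and `TeamLaunch` appears in mixed formats (`1931to32`, `1955-56`, `2017~18`). The following is a minimal cleaning sketch, not the notebook's own method. It runs on a small stand-in frame copied from rows shown above, and the names `demo`, `cleaned`, `numeric_cols` and `LaunchYear` are assumptions introduced for illustration.

```python
import pandas as pd

# Stand-in frame: three rows copied from the head/tail outputs above (Team 4, 60, 61).
demo = pd.DataFrame({
    "Team": ["Team 4", "Team 60", "Team 61"],
    "Score": ["3386", "14", "-"],
    "PlayedGames": ["2664", "30", "-"],
    "TournamentChampion": ["6", "-", "-"],
    "TeamLaunch": ["1931to32", "1955-56", "2017~18"],
})

numeric_cols = ["Score", "PlayedGames", "TournamentChampion"]

# Count the '-' placeholders per column before changing anything.
placeholder_counts = (demo[numeric_cols] == "-").sum()

cleaned = demo.copy()
for col in numeric_cols:
    # errors="coerce" turns any non-numeric string (here '-') into NaN.
    # Assumption: '-' is the only non-numeric value; coerce would silently hide others.
    cleaned[col] = pd.to_numeric(cleaned[col], errors="coerce")

# Keep only the first four-digit year of TeamLaunch, whatever the separator.
cleaned["LaunchYear"] = cleaned["TeamLaunch"].str.extract(r"(\d{4})")[0].astype(int)

print(placeholder_counts)
print(cleaned.dtypes)
print(cleaned)
```

Whether a `-` in `TournamentChampion` or `Runner-up` should stay missing or be treated as zero titles is an analysis decision; the sketch leaves it as `NaN`.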

### Data Visualization
Exploratory Data Analysis (EDA) is incomplete without data visualization, the graphical representation of data. Presenting the analysis visually makes it easier to grasp information that would otherwise go unnoticed and to identify new patterns.