## **NFL PLAYERS (Average Age, Average Weight, and Average Height)** <a id="1"></a>

<a><img style="float: right;" src="https://www.linkpicture.com/q/nfl_1.jpg" width="300" /></a>
 



- Dataset source: https://www.kaggle.com/datasets/kendallgillies/nflstatistics

### 1.2 Notebook Preparation <a id="1.2"></a>

This part of the notebook imports the required libraries and sets the display options.

In [6]:
# Import libraries

import pandas as pd
import numpy as np 
from scipy import stats

import matplotlib.pyplot as plt
import seaborn as sns
import plotly
import plotly.graph_objects as go
import plotly.express as px

from sklearn.preprocessing import StandardScaler

from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage
from scipy.cluster.hierarchy import dendrogram
from sklearn.metrics import silhouette_samples, silhouette_score

In [7]:
# Set notebook options

pd.set_option('display.precision', 2)
pd.options.display.max_columns = 30

import warnings
warnings.filterwarnings("ignore")
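
To illustrate what the precision option does, here is a minimal sketch on a made-up one-column frame. It uses `pd.option_context`, which applies the setting only inside the `with` block; the frame `toy` and its values are hypothetical and not part of the NFL data.

```python
import pandas as pd

# Hypothetical toy frame, used only to show the effect of the display option.
toy = pd.DataFrame({"value": [1.23456, 2.5]})

# option_context applies the setting temporarily, inside the with-block only.
with pd.option_context("display.precision", 2):
    rounded_view = repr(toy)

print(rounded_view)
```

The option only changes how floats are printed; the stored values keep their full precision.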

## **2. Data Preparation** <a id="2"></a>

The section below provides an initial exploration of the data.

In [8]:
# Import the data as a DataFrame and check first 5 rows

df = pd.read_csv('nfl_player.csv')

df.head(5)

| | Age | Birth Place | Birthday | College | Current Status | Current Team | Experience | Height (inches) | High School | High School Location | Name | Number | Player Id | Position | Weight (lbs) | Years Played |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | Grand Rapids , MI | 5/23/1921 | Notre Dame | Retired | NaN | 3 Seasons | 71.0 | NaN | NaN | Evans, Fred | NaN | fredevans/2513736 | NaN | 185.0 | 1946 - 1948 |
| 1 | NaN | Dayton , OH | 12/21/1930 | Dayton | Retired | NaN | 1 Season | 70.0 | NaN | NaN | Raiff, Jim | NaN | jimraiff/2523700 | NaN | 235.0 | 1954 - 1954 |
| 2 | 56.0 | Temple , TX | 9/11/1960 | Louisiana Tech | Retired | NaN | 1 Season | 74.0 | NaN | NaN | Fowler, Bobby | NaN | bobbyfowler/2514295 | NaN | 230.0 | 1985 - 1985 |
| 3 | 30.0 | New Orleans , LA | 9/30/1986 | LSU | Retired | NaN | 5 Seasons | 73.0 | NaN | NaN | Johnson, Quinn | NaN | quinnjohnson/79593 | NaN | 255.0 | 2009 - 2013 |
| 4 | 25.0 | Detroit , MI | 3/31/1992 | Central Michigan | Active | Pittsburgh Steelers | 3rd season | 77.0 | Clintondale HS | Clinton Twp.,Macomb Co., MI | Walton, L.T. | 96.0 | l.t.walton/2552444 | DE | 305.0 | NaN |


In [12]:
# Let us count the number of rows and columns in our imported NFL dataset.

df.shape

(17172, 16)

In [14]:
# The dataset has 17,172 rows and 16 columns, which is large enough to estimate the average age, height, and weight.

In [9]:
# Check data types and if any records are missing

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17172 entries, 0 to 17171
Data columns (total 16 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Age                   13504 non-null  float64
 1   Birth Place           14794 non-null  object 
 2   Birthday              16835 non-null  object 
 3   College               17133 non-null  object 
 4   Current Status        17172 non-null  object 
 5   Current Team          3096 non-null   object 
 6   Experience            17133 non-null  object 
 7   Height (inches)       17026 non-null  float64
 8   High School           2514 non-null   object 
 9   High School Location  2510 non-null   object 
 10  Name                  17172 non-null  object 
 11  Number                1708 non-null   float64
 12  Player Id             17172 non-null  object 
 13  Position              3096 non-null   object 
 14  Weight (lbs)          17121 non-null  float64
 15  Years Played          14076 non-null  object 

- Several columns have missing values. Let us count the missing values in each column.

In [11]:
# We can count the missing values in each column of our dataset.

df.isnull().sum()

Age                      3668
Birth Place              2378
Birthday                  337
College                    39
Current Status              0
Current Team            14076
Experience                 39
Height (inches)           146
High School             14658
High School Location    14662
Name                        0
Number                  15464
Player Id                   0
Position                14076
Weight (lbs)               51
Years Played             3096
dtype: int64
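
Raw counts are easier to judge as a share of all rows. Below is a minimal sketch, on a made-up frame, of how the percentage of missing values per column could be computed. The frame `toy` and its values are illustrative assumptions, not rows from the NFL dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for three columns of the NFL table.
toy = pd.DataFrame({
    "Age": [56.0, np.nan, 25.0, np.nan],
    "Height (inches)": [74.0, 73.0, np.nan, 70.0],
    "Weight (lbs)": [230.0, 255.0, 305.0, 175.0],
})

# isnull() gives a boolean frame; the column mean is the fraction missing.
missing_pct = toy.isnull().mean().mul(100).round(1)

print(missing_pct)
```

The same expression applied to `df` would show, for each column, how much of the data is lost if rows with missing values are dropped.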

- We are interested in the average age, height, and weight of NFL players. Age is missing for 3,668 of the 17,172 records (about 21%), height for 146, and weight for 51. Since we only need averages, we drop the rows in which any of these three values is missing.

In [18]:
# Drop rows with a missing Age, Height, or Weight value

df = df.dropna(subset=['Age', 'Height (inches)', 'Weight (lbs)'])

In [19]:
# Let us check if we still have missing Age, Height, and Weight records in the NFL dataset.

df.isnull().sum()

Age                         0
Birth Place              1987
Birthday                    0
College                     0
Current Status              0
Current Team            10537
Experience                  0
Height (inches)             0
High School             11006
High School Location    11010
Name                        0
Number                  11790
Player Id                   0
Position                10537
Weight (lbs)                0
Years Played             2959
dtype: int64

In [20]:
# No missing Age, Weight and Height records.

# Let us count the number of records we have after deleting the missing Age, Weight, and Height records

In [21]:
df.shape

(13496, 16)

In [23]:
# 13,496 rows and 16 columns remain, which is still a sizeable sample for computing averages.
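
The effect of `dropna(subset=...)` can be sketched on a small made-up frame. The names `toy`, `required`, and `cleaned` and all values are hypothetical; the point is that a row is dropped only when one of the listed columns is missing, while gaps in other columns are ignored.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "Position" has gaps that should NOT trigger a drop.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Age": [56.0, np.nan, 25.0, 30.0],
    "Height (inches)": [74.0, 73.0, np.nan, 70.0],
    "Weight (lbs)": [230.0, 255.0, 305.0, 175.0],
    "Position": [None, None, "DE", None],
})

required = ["Age", "Height (inches)", "Weight (lbs)"]

# Drop a row only if at least one of the required columns is missing.
cleaned = toy.dropna(subset=required)

rows_before, rows_after = len(toy), len(cleaned)
print(rows_before, rows_after)
```

This matches what the counts above show: columns such as `Position` still contain missing values after the drop because they are not in the subset.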

- Let us extract Age, Weight and Height information from the dataset

In [35]:
# Here, we extract the ages of NFL players from our dataset, and display the first ten

df_Age = df['Age']

df_Age.head(10)

2     56.0
3     30.0
4     25.0
7     22.0
9     98.0
11    56.0
12    33.0
13    27.0
14    73.0
15    74.0
Name: Age, dtype: float64

In [42]:
# Let us find the average age of NFL players (years)

df_Age_mean = df_Age.mean()

df_Age_mean

43.82683758150563
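
The dataset contains both active and retired players (see the `Current Status` column), so the overall mean age mixes the two groups. As a hedged sketch, assuming one wanted to look at the groups separately, the mean could be split with `groupby`. The frame below is made up and its values are illustrative only.

```python
import pandas as pd

# Hypothetical toy data with the two status values seen in the head() output.
toy = pd.DataFrame({
    "Current Status": ["Retired", "Retired", "Active", "Active"],
    "Age": [56.0, 74.0, 25.0, 27.0],
})

# Mean age over all rows, and mean age within each status group.
overall_mean = toy["Age"].mean()
mean_age_by_status = toy.groupby("Current Status")["Age"].mean()

print(overall_mean)
print(mean_age_by_status)
```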

In [38]:
# Here, we extract the weight of NFL players from our dataset, and display the first ten

df_Weight = df['Weight (lbs)']

df_Weight.head(10)

2     230.0
3     255.0
4     305.0
7     250.0
9     175.0
11    180.0
12    219.0
13    185.0
14    220.0
15    210.0
Name: Weight (lbs), dtype: float64

In [39]:
# Let us find the average weight of NFL players (lbs)

df_Weight_mean = df_Weight.mean()

df_Weight_mean

In [40]:
# Here, we extract the heights of NFL players from our dataset, and display the first ten

df_Height = df['Height (inches)']

df_Height.head(10)

2     74.0
3     73.0
4     77.0
7     77.0
9     70.0
11    69.0
12    74.0
13    71.0
14    73.0
15    73.0
Name: Height (inches), dtype: float64

In [41]:
# Let us find the average height of NFL players (inches)

df_Height_mean = df_Height.mean()

df_Height_mean

73.81735328986366
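
The three averages can also be computed in a single step by selecting the columns and calling `mean()` once. The sketch below is a minimal example that uses only the first three rows shown in the `head(10)` outputs above, so its results are not the dataset averages; the names `toy`, `cols`, and `averages` are chosen for illustration.

```python
import pandas as pd

# First three rows from the head(10) outputs above (index 2, 3, 4).
toy = pd.DataFrame({
    "Age": [56.0, 30.0, 25.0],
    "Height (inches)": [74.0, 73.0, 77.0],
    "Weight (lbs)": [230.0, 255.0, 305.0],
})

cols = ["Age", "Height (inches)", "Weight (lbs)"]

# mean() on a frame returns one value per column, labelled by column name.
averages = toy[cols].mean().round(2)

print(averages)
```

Because the result is labelled by column name, this form makes it harder to display the wrong variable by accident.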

## **5. Conclusion** <a id="5"></a>

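- The average values for NFL players in this dataset, which includes retired players, are as follows:

  average age = roughly 44 years (43.83)
  
  average weight = the value stored in `df_Weight_mean` (lbs), which is not displayed in the output above
  
  average height = 73.82 inches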