# 🏈 America's National Football League (NFL)
Big Data Analysis - The most popular sports league

![](https://cdn.iconscout.com/icon/free/png-256/nfl-283167.png)

# The Challenge 🎯

Your challenge is to generate actionable, practical, and novel insights from player tracking data that corresponds to special teams play. There are several potential topics for participants to analyze. These include, but are not limited to:

1. Create a new special teams metric. The winning algorithm from the 2020 Big Data Bowl has been adopted by the NFL/NFL Network for on-air distribution, and we are hopeful that a new stat for special teams plays could come from this year’s competition.

2. Quantify special teams strategy. Special teams coaches are among the most creative and innovative in the league. Compare and contrast how each team game-plans. Which strategies yield the best results? What other strategies could be adopted?

3. Rank special teams players. Each team employs a variety of players (including long snappers, kickers, punters, and other utility special teams players). How do they stack up against one another?

Competition Steps:

* Import the required Python libraries
* Load and explore the data (exploratory data analysis, EDA)
* Clean the data and check for null values
* Data analysis, visualization and histograms

# Import the required Python libraries 📄

In [None]:
import numpy as np 
import pandas as pd 

import matplotlib.pyplot as plt 
import matplotlib as mpl
import seaborn as sns 
from datetime import datetime
from wordcloud import WordCloud, STOPWORDS


import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


# Loading and Exploring Data (EDA)

# Games Data 📊

In [None]:
# Loading dataset games.csv
games_data = pd.read_csv('../input/nfl-big-data-bowl-2022/games.csv')
games_data.head()

In [None]:
# Get Total (rows, cols) using shape
games_data.shape

# Cleaning Data / Null Values

In [None]:
# Check for Null Values
null_values = games_data.isnull().sum()
null_values

> No null values
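The null check above reports raw counts only. As a minimal sketch, the same check can also report the share of missing values per column, which is easier to read on larger tables such as the plays data. The `toy_games` frame and the `null_summary` helper are hypothetical stand-ins introduced for illustration; they are not part of the original notebook.

```python
import pandas as pd

# Hypothetical toy stand-in for games_data; the real frame comes from games.csv.
toy_games = pd.DataFrame({
    "gameId": [1, 2, 3, 4],
    "season": [2018, 2018, 2019, 2019],
    "homeTeamAbbr": ["PHI", None, "DAL", "NE"],
})


def null_summary(df):
    """Return the count and percentage of missing values for each column."""
    counts = df.isnull().sum()
    percent = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"null_count": counts, "null_percent": percent})


summary = null_summary(toy_games)
print(summary)
```

On the real data, calling `null_summary(games_data)` should show zeros in every row if the "no null values" observation holds.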

# Data Analysis, Visualization and Histograms

Columns in `games.csv`:

* **gameId**: Game identifier, unique (numeric)
* **season**: Season of game
* **week**: Week of game
* **gameDate**: Game date (time, mm/dd/yyyy)
* **gameTimeEastern**: Start time of game (time, HH:MM:SS, EST)
* **homeTeamAbbr**: Home team three-letter code (text)
* **visitorTeamAbbr**: Visiting team three-letter code (text)
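Since the date and start time are stored in two separate text columns, it can help to combine them into a single timestamp. The sketch below assumes the formats listed above (mm/dd/yyyy and HH:MM:SS) and uses a hypothetical `toy_games` frame with made-up rows; the `kickoff` and `kickoffHour` column names are also introduced here for illustration.

```python
import pandas as pd

# Hypothetical toy rows in the documented formats (mm/dd/yyyy and HH:MM:SS).
toy_games = pd.DataFrame({
    "gameDate": ["09/06/2018", "09/09/2018"],
    "gameTimeEastern": ["20:20:00", "13:00:00"],
})

# Concatenate the two text columns and parse them with an explicit format.
toy_games["kickoff"] = pd.to_datetime(
    toy_games["gameDate"] + " " + toy_games["gameTimeEastern"],
    format="%m/%d/%Y %H:%M:%S",
)

# Derived feature: hour of kickoff (Eastern time, timezone-naive here).
toy_games["kickoffHour"] = toy_games["kickoff"].dt.hour
print(toy_games[["kickoff", "kickoffHour"]])
```

The timestamps are left timezone-naive on purpose; they are only meaningful as Eastern time.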

In [None]:
games_data.info()


**Group By Season**

In [None]:
games_data.groupby('season').size()
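The same grouping extends to more than one key. As a rough sketch, grouping by both `season` and `week` and unstacking gives a season-by-week table of game counts. The `toy_games` frame below is a hypothetical stand-in with made-up rows.

```python
import pandas as pd

# Hypothetical toy stand-in for games_data.
toy_games = pd.DataFrame({
    "gameId": [1, 2, 3, 4, 5],
    "season": [2018, 2018, 2018, 2019, 2019],
    "week": [1, 1, 2, 1, 1],
})

# Count games per (season, week), then pivot weeks into columns.
# fill_value=0 marks season/week pairs with no games in the data.
games_per_week = (
    toy_games.groupby(["season", "week"])
    .size()
    .unstack(fill_value=0)
)
print(games_per_week)
```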

<iframe src="https://www.kaggle.com/embed/girishkumarsahu/nfl-big-data-bowl-2022?cellIds=10&kernelSessionId=79075208" height="300" style="margin: 0 auto; width: 100%; max-width: 950px;" frameborder="0" scrolling="auto" title="NFL_Big_Data_Bowl_2022"></iframe>


In [None]:
# Parse gameDate (mm/dd/yyyy strings) into datetimes
games_data["gameDate"] = pd.to_datetime(games_data["gameDate"])
# Extract the month name
games_data['Month'] = games_data["gameDate"].dt.month_name()
# Extract the day-of-week name
games_data['DayOfWeek'] = games_data["gameDate"].dt.day_name()

sns.set_style("whitegrid")
plt.figure(figsize=(8, 6))
ax = plt.gca()

# Number of games per season, split by calendar month
sns.countplot(x='season', data=games_data, hue='Month', lw=2, ax=ax).set(title='Games per Season by Month')
ax.legend(loc='center right', bbox_to_anchor=(1.5, 0.5), ncol=1)
plt.show()
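The counts behind a plot like this one can also be read off as a table. The sketch below uses `pd.crosstab` on a hypothetical `toy_games` frame and orders the months by the football calendar rather than alphabetically. The `month_order` list is an assumption about which months appear in a season.

```python
import pandas as pd

# Hypothetical toy stand-in for games_data.
toy_games = pd.DataFrame({
    "season": [2018, 2018, 2018, 2019],
    "gameDate": ["09/06/2018", "12/30/2018", "09/09/2018", "11/03/2019"],
})

dates = pd.to_datetime(toy_games["gameDate"], format="%m/%d/%Y")
month_names = dates.dt.month_name()

# Assumed season order; months missing from the data show up as zero columns.
month_order = ["September", "October", "November", "December", "January"]

games_by_month = pd.crosstab(toy_games["season"], month_names).reindex(
    columns=month_order, fill_value=0
)
print(games_by_month)
```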

# Plays Data 📊

In [None]:
plays_data = pd.read_csv("../input/nfl-big-data-bowl-2022/plays.csv")
plays_data.head()

In [None]:
plays_data.shape
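The plays table only becomes useful for season-level questions once it is linked back to the games table. The sketch below shows one way to do that with a left merge on `gameId`. It assumes that `plays.csv` contains `gameId`, `playId` and `specialTeamsPlayType` columns; the toy frames and their values are hypothetical.

```python
import pandas as pd

# Hypothetical toy stand-ins for games_data and plays_data.
toy_games = pd.DataFrame({"gameId": [1, 2], "season": [2018, 2019]})
toy_plays = pd.DataFrame({
    "gameId": [1, 1, 2, 2, 2],
    "playId": [10, 20, 10, 20, 30],
    # Assumed column name for the kind of special teams play.
    "specialTeamsPlayType": ["Punt", "Kickoff", "Punt", "Punt", "Field Goal"],
})

# Attach the season to each play; validate guards against duplicate gameIds
# in the games table, which would silently multiply rows.
plays_with_season = toy_plays.merge(
    toy_games, on="gameId", how="left", validate="many_to_one"
)

# Number of plays of each type per season.
play_type_counts = plays_with_season.groupby(
    ["season", "specialTeamsPlayType"]
).size()
print(play_type_counts)
```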

# Players Data 📊

In [None]:
players_data = pd.read_csv("../input/nfl-big-data-bowl-2022/players.csv")
players_data

**Players from Texas Tech**

In [None]:
players_data.query('collegeName == "Texas Tech"')

In [None]:
collegs_data = players_data.groupby('collegeName').size()

In [None]:
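mpl.rcParams['font.size'] = 12
mpl.rcParams['savefig.dpi'] = 100
mpl.rcParams['figure.subplot.bottom'] = .1

stopwords = set(STOPWORDS)

# Join the actual college names into one string. str(Series) would only give
# the truncated printed representation, including the index numbers.
college_text = ' '.join(players_data['collegeName'].dropna().astype(str))

wordcloud = WordCloud(
    background_color='white',
    stopwords=stopwords,
    max_words=400,
    max_font_size=40,
    random_state=42
).generate(college_text)

fig = plt.figure(1)
plt.imshow(wordcloud)
plt.axis('off')
# Save before plt.show(), which may clear the figure depending on the backend
fig.savefig("word1.png", dpi=900)
plt.show()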
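A word cloud built from raw text splits multi-word names, so "Texas Tech" may show up as two separate words. One possible alternative, sketched below, is to count whole college names first and pass the counts to `WordCloud.generate_from_frequencies`. The `toy_players` frame and the `render_college_cloud` helper are hypothetical; the helper imports `wordcloud` lazily so the counting step runs without it.

```python
import pandas as pd

# Hypothetical toy stand-in for players_data.
toy_players = pd.DataFrame({
    "collegeName": ["Texas Tech", "Alabama", "Texas Tech", None, "Ohio State"],
})

# Count each full college name, ignoring missing values.
college_freqs = toy_players["collegeName"].dropna().value_counts().to_dict()
print(college_freqs)


def render_college_cloud(freqs):
    """Build a word cloud where each whole college name is one token."""
    from wordcloud import WordCloud  # imported lazily; needs the wordcloud package

    cloud = WordCloud(background_color="white", max_words=400, random_state=42)
    return cloud.generate_from_frequencies(freqs)
```

The returned object can be passed to `plt.imshow` in the same way as the cloud above.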

**Player names word cloud**

In [None]:
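mpl.rcParams['font.size'] = 12
mpl.rcParams['savefig.dpi'] = 100
mpl.rcParams['figure.subplot.bottom'] = .1

stopwords = set(STOPWORDS)

# Join the actual player names into one string instead of using str(Series),
# which only gives the truncated printed representation.
names_text = ' '.join(players_data['displayName'].dropna().astype(str))

wordcloud = WordCloud(
    background_color='white',
    stopwords=stopwords,
    max_words=200,
    max_font_size=30,
    random_state=42
).generate(names_text)

fig = plt.figure(1)
plt.imshow(wordcloud)
plt.axis('off')
# Save under a different name so the college word cloud is not overwritten,
# and save before plt.show(), which may clear the figure
fig.savefig("word2.png", dpi=900)
plt.show()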

In [None]:
collegs_data.head()

In [None]:
collegs_data.tail()
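Because the grouped result is ordered alphabetically by college name, `head()` and `tail()` show the first and last names rather than the largest counts. A minimal sketch of sorting by count instead, using a hypothetical `toy_players` frame:

```python
import pandas as pd

# Hypothetical toy stand-in for players_data.
toy_players = pd.DataFrame({
    "collegeName": ["Alabama", "LSU", "Alabama", "Texas Tech", "LSU", "Alabama"],
})

# Players per college, as in the notebook's groupby.
college_counts = toy_players.groupby("collegeName").size()

# Sort by count so the most represented colleges come first.
top_colleges = college_counts.sort_values(ascending=False).head(2)
print(top_colleges)
```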

**Number of colleges represented in the NFL**

In [None]:
# Number of distinct colleges in players.csv
print('Number of colleges:', collegs_data.shape[0])

**Count of Players by Position**

In [None]:
players_data['Position'].value_counts() \
    .sort_values(ascending=True) \
    .plot(kind='barh', figsize=(10, 15),
          title='Count of Players by Position')
plt.show()
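Raw counts can be hard to compare across positions. As a small sketch, `value_counts(normalize=True)` gives each position's share of all players instead. The `toy_players` frame is a hypothetical stand-in, and the position codes in it are made up for illustration.

```python
import pandas as pd

# Hypothetical toy stand-in for players_data.
toy_players = pd.DataFrame({"Position": ["WR", "WR", "K", "P"]})

# Share of players at each position, as a percentage.
position_share = (
    toy_players["Position"].value_counts(normalize=True).mul(100).round(1)
)
print(position_share)
```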


Please consider voting if you find it useful 👍

**Thanks**