In this Kaggle notebook, we will delve into a comparative analysis of Adidas and Nike, two titans in the global sportswear market. Our objective is to understand which brand is more discount-friendly, which has a broader product range, how many products they share, and which of their products are highest rated by consumers.

We will leverage a dataset that provides detailed information on the products offered by both brands, including pricing, product types, and customer reviews. Through exploratory data analysis (EDA), we will scrutinize the data to answer the following key questions:

    Discount Analysis: How do Adidas and Nike compare in terms of their discount strategies? Which brand offers more significant discounts?
    Product Range: Which brand has a wider variety of products? How does the diversity of their product lineup compare?
    Common Products: How many products are common to both Adidas and Nike? This will help us understand the overlap in their offerings.
    Highest Rated Products: By analyzing customer reviews, we will identify the top-rated products from each brand. This will give us insights into the preferences of consumers and the strengths of each brand.

By the end of this notebook, we aim to provide a comprehensive comparison that can inform strategic decisions for both consumers and stakeholders interested in the sportswear industry.


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns


These imports are essential tools for data analysis and visualization in Python:

numpy (aliased as np) is used for numerical computations and operations on arrays.
pandas (aliased as pd) is a library for data manipulation and analysis, providing data structures for handling datasets.
matplotlib.pyplot (aliased as plt) is a plotting library for creating static, animated, and interactive visualizations.
seaborn (aliased as sns) is a statistical data visualization library based on matplotlib, providing a high-level interface for creating informative graphics.

In [None]:
NikeAdidas = pd.read_csv('/kaggle/input/adidas-vs-nike/Adidas Vs Nike.csv')


This line of code reads a CSV file named 'Adidas Vs Nike.csv' from the specified Kaggle input directory and loads it into a pandas DataFrame named NikeAdidas.

In [None]:
NikeAdidas.describe()

In [None]:
NikeAdidas = NikeAdidas.drop(columns=['Product ID'])

In [None]:
NikeAdidas['Last Visited'] = pd.to_datetime(NikeAdidas['Last Visited'])

This line of code converts the 'Last Visited' column in the NikeAdidas DataFrame to a datetime format.
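
As a minimal sketch of what the conversion enables, the toy example below parses string timestamps and then reads a component through the `.dt` accessor. The timestamp values and the `toy` DataFrame are hypothetical; the actual format of 'Last Visited' in the dataset is an assumption here.

```python
import pandas as pd

# Toy data: hypothetical timestamps, not taken from the dataset
toy = pd.DataFrame({"Last Visited": ["2020-04-13T15:06:14", "2020-04-14T09:30:00"]})

# Parse the strings into pandas datetime values
toy["Last Visited"] = pd.to_datetime(toy["Last Visited"])

# Datetime columns expose the .dt accessor for components such as the hour
visit_hours = toy["Last Visited"].dt.hour.tolist()
print(toy.dtypes)
print(visit_hours)
```

Once the column is a datetime, it can also be sorted chronologically or filtered by date range, which plain strings do not reliably support.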

In [None]:
NikeAdidas

In [None]:
# Create a boolean mask for rows where Listing Price is 0
price_zero_mask = NikeAdidas['Listing Price'] == 0

# Update the Listing Price with Sale Price where Listing Price was 0
NikeAdidas.loc[price_zero_mask, 'Listing Price'] = NikeAdidas.loc[price_zero_mask, 'Sale Price']

# Display the updated DataFrame
print(NikeAdidas)


The first statement creates a boolean mask (price_zero_mask) that is True for each row in the NikeAdidas DataFrame where the 'Listing Price' is 0.
The second statement uses this mask to overwrite those 'Listing Price' values with the corresponding 'Sale Price'.
Replacing placeholder zeros with another available value, in this case the sale price, is a common way to handle missing or zero entries in a dataset.
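
The same masking pattern can be sketched on a small toy DataFrame. The prices below are hypothetical and only the two column names come from the dataset.

```python
import pandas as pd

# Toy data: hypothetical prices, not taken from the dataset
toy = pd.DataFrame({
    "Listing Price": [0, 7999, 0],
    "Sale Price": [4999, 3999, 2999],
})

# True where the listing price is the placeholder value 0
zero_mask = toy["Listing Price"] == 0

# Copy the sale price into the listing price only for the masked rows
toy.loc[zero_mask, "Listing Price"] = toy.loc[zero_mask, "Sale Price"]
print(toy)
```

One consequence worth keeping in mind: for the rows that were filled this way, the listing price equals the sale price, so any discount derived from the two columns would be zero for those rows.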

In [None]:
NikeAdidas.Brand.unique()

In [None]:
NikeAdidas['Brand']= NikeAdidas['Brand'].replace('Adidas Adidas ORIGINALS','Adidas ORIGINALS')
NikeAdidas.Brand.unique()

NikeAdidas['Brand']= NikeAdidas['Brand'].replace('Adidas Adidas ORIGINALS','Adidas ORIGINALS'): This line uses the replace method to change the brand name 'Adidas Adidas ORIGINALS' to 'Adidas ORIGINALS' in the 'Brand' column of the NikeAdidas DataFrame. This is a common operation to clean up data and ensure consistency in the way brand names are represented.

NikeAdidas.Brand.unique(): This line retrieves all unique values in the 'Brand' column of the NikeAdidas DataFrame. This is useful for understanding the different brands present in the dataset, which can be important for further analysis or for verifying that the data cleaning steps have been applied correctly.
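
A minimal sketch of this clean-up on a toy Series (the `brands` values below are illustrative, using the brand labels that appear in this notebook):

```python
import pandas as pd

# Toy brand column containing the duplicated-prefix label
brands = pd.Series(
    ["Nike", "Adidas Adidas ORIGINALS", "Adidas ORIGINALS", "Adidas CORE / NEO"],
    name="Brand",
)

# replace() with two scalars swaps whole cell values that match exactly
cleaned = brands.replace("Adidas Adidas ORIGINALS", "Adidas ORIGINALS")

# unique() confirms that the malformed label is gone
print(cleaned.unique())
```

Note that `Series.replace` with plain strings matches entire cell values, not substrings, so other labels that merely contain "Adidas" are left untouched.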

In [None]:
GroupedByBrands=NikeAdidas.groupby('Brand')
Nike = GroupedByBrands.get_group('Nike')
AdidasOriginals = GroupedByBrands.get_group('Adidas ORIGINALS')
AdidasNeo = GroupedByBrands.get_group('Adidas CORE / NEO')
AdidasSports = GroupedByBrands.get_group('Adidas SPORT PERFORMANCE')

# Combine the Adidas groups into one DataFrame
AdidasFrames = [AdidasOriginals, AdidasNeo, AdidasSports]
Adidas = pd.concat(AdidasFrames)
Adidas.reset_index(inplace=True, drop=True)

GroupedByBrands=NikeAdidas.groupby('Brand'): It groups the NikeAdidas DataFrame by the 'Brand' column, creating a GroupBy object for further operations.

Nike = GroupedByBrands.get_group('Nike'): It retrieves the subset of the DataFrame for the 'Nike' brand.

AdidasOriginals = GroupedByBrands.get_group('Adidas ORIGINALS'): It retrieves the subset for the 'Adidas ORIGINALS' brand.

AdidasNeo = GroupedByBrands.get_group('Adidas CORE / NEO'): It retrieves the subset for the 'Adidas CORE / NEO' brand.

AdidasSports = GroupedByBrands.get_group('Adidas SPORT PERFORMANCE'): It retrieves the subset for the 'Adidas SPORT PERFORMANCE' brand.

AdidasFrames = [AdidasOriginals, AdidasNeo, AdidasSports]: It creates a list of DataFrames containing the Adidas subsets.

Adidas = pd.concat(AdidasFrames): It concatenates these DataFrames into a single DataFrame, Adidas, which contains all the rows from the individual Adidas subsets.

Adidas.reset_index(inplace=True, drop=True): It resets the index of the Adidas DataFrame, dropping the old index, and does so in place without creating a new DataFrame.

The output is a DataFrame Adidas that combines all the Adidas-branded products from the original NikeAdidas DataFrame into a single DataFrame, with a reset index.
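
An alternative way to get the same two subsets is a boolean filter on the brand prefix. This is a sketch on toy data, not the notebook's method, and it assumes every Adidas label starts with "Adidas" and every other row is Nike.

```python
import pandas as pd

# Toy data: hypothetical product names, brand labels as used in this notebook
toy = pd.DataFrame({
    "Brand": ["Nike", "Adidas ORIGINALS", "Adidas CORE / NEO",
              "Nike", "Adidas SPORT PERFORMANCE"],
    "Product Name": ["A", "B", "C", "D", "E"],
})

# True for every row whose brand label starts with "Adidas"
is_adidas = toy["Brand"].str.startswith("Adidas")

# Split into the two brands and renumber each subset from 0
adidas = toy[is_adidas].reset_index(drop=True)
nike = toy[~is_adidas].reset_index(drop=True)
print(adidas)
print(nike)
```

Unlike the concat approach, which orders rows sub-brand by sub-brand, the filter keeps rows in their original order. The row contents are the same either way.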

In [None]:
GroupedByBrands = NikeAdidas.groupby('Brand')['Discount']
AvgDis = GroupedByBrands.mean()
plt.figure(figsize=(10,6))
plt.pie(AvgDis.values, autopct="%.1f%%", explode=[0.10] * len(AvgDis), labels=AvgDis.index)
plt.title('Discounts offered by Brands')

Based on the pie chart, Adidas is the more discount-friendly brand: its sub-brands account for all of the average discount shown, while Nike offers no discounts in this dataset. Discounts can increase sales and customer loyalty by giving consumers an incentive to buy, so their absence could limit Nike's market reach or sales volume. However, it's important to consider other factors such as product quality, brand reputation, and customer service when determining the overall best brand.
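
A pie chart normalizes its inputs, so the slices show each brand's share of the summed averages rather than the averages themselves. As a hedged complement, the sketch below computes the per-brand mean discount and the share of products with any discount directly. The toy values are hypothetical, and `discounted_share` is a name introduced here for illustration.

```python
import pandas as pd

# Toy data: hypothetical discount percentages, not taken from the dataset
toy = pd.DataFrame({
    "Brand": ["Nike", "Nike", "Adidas ORIGINALS", "Adidas ORIGINALS"],
    "Discount": [0, 0, 50, 30],
})

# Mean discount per brand (the quantity fed to the pie chart)
avg_discount = toy.groupby("Brand")["Discount"].mean()

# Fraction of each brand's products that carry a non-zero discount
discounted_share = toy["Discount"].gt(0).groupby(toy["Brand"]).mean()

print(avg_discount)
print(discounted_share)
```

Reading the two tables side by side separates "how large are the discounts" from "how many products are discounted", which a single pie chart cannot show.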

In [None]:
# Count the number of distinct product names for Nike and Adidas
ProductsCount = {
    'Nike': Nike['Product Name'].nunique(),
    'Adidas': Adidas['Product Name'].nunique()
}

# Create a bar plot for the number of products
sns.set(rc={'figure.figsize':(7,5)})
palette = ["blue" if brand == "Nike" else "red" for brand in ProductsCount.keys()]
sns.barplot(x=list(ProductsCount.keys()), y=list(ProductsCount.values()), palette=palette)
plt.xlabel('Brands')
plt.ylabel('Number of Products offered')
plt.title('Products offered By Brands')

Based on the bar graph data, Adidas offers a significantly larger product range than Nike, with over 1000 products compared to Nike's 400 products. This could suggest that Adidas may have a broader appeal and market presence, potentially reaching a wider consumer base. However, it's important to consider other factors such as the quality and popularity of each product, as well as the brand's marketing and customer service strategies, to determine which brand is "better" overall.
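
The product counts above are counts of distinct product names, not of rows. A minimal sketch on hypothetical names shows the difference when a name is listed more than once:

```python
import pandas as pd

# Toy data: hypothetical product names, one of them listed twice
names = pd.Series(["Air Max", "Air Max", "Pegasus"], name="Product Name")

# Number of rows (listings)
row_count = len(names)

# Number of distinct product names
distinct_count = names.nunique()

print(row_count, distinct_count)
```

If the same name appears under several listings, the row count overstates the product range, which is why the distinct count is the more appropriate measure here.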

In [None]:
# Create a bar plot for the most common Adidas products
AdidasTopProducts = Adidas['Product Name'].value_counts().head()
colors = sns.color_palette("hsv", len(AdidasTopProducts))
sns.set_theme(style='whitegrid', rc={'figure.figsize': (11.7, 8.27)})
sns.barplot(x=AdidasTopProducts.index, y=AdidasTopProducts, palette=colors)
plt.xlabel('Product Name')
plt.ylabel('Count')
plt.title('Common Adidas Products Manufactured')

In [None]:
# Create a bar plot for the most common Nike products
NikeTopProducts = Nike['Product Name'].value_counts().head()
colors = sns.color_palette("hsv", len(NikeTopProducts))
sns.set_theme(style='whitegrid', rc={'figure.figsize': (11.7, 8.27)})
sns.barplot(x=NikeTopProducts.index, y=NikeTopProducts, palette=colors)
plt.xlabel('Product Name')
plt.ylabel('Count')
plt.title('Common Nike Products Manufactured')

In [None]:
# Print descriptive statistics for Rating and Reviews for both Nike and Adidas
print("\nDescriptive Statistics for Nike:\n")
print(Nike[['Rating', 'Reviews']].describe())
print("\nDescriptive Statistics for Adidas:\n")
print(Adidas[['Rating', 'Reviews']].describe())

In [None]:
HighestRatedProductNike = Nike[Nike.Rating == Nike.Rating.max()]['Product Name']
HighestRatedProductAdidas = Adidas[Adidas.Rating == Adidas.Rating.max()]['Product Name']

In [None]:
# Print the names of the highest rated products in Nike
print("\nHighest Rated Product for Nike:\n")
for product in HighestRatedProductNike:
    print(product)

In [None]:
# Print the names of the highest rated products in Adidas
print("\nHighest Rated Product for Adidas:\n")
for product in HighestRatedProductAdidas:
    print(product)

Based on the data and analysis conducted in this Kaggle notebook, Adidas appears to have a broader product range with over 1000 products compared to Nike's 400 products. This suggests that Adidas may have a more extensive market presence and a wider appeal to consumers. However, it's important to note that product range is just one aspect of a brand's performance and does not necessarily translate to market dominance or consumer satisfaction.

Adidas' larger product range could indicate a more diversified offering, which may attract a wider customer base. The analysis also showed that every Adidas sub-brand offered discounts on average, while Nike offered none in this dataset. This could be seen as a strategic advantage for Adidas in terms of pricing, which may influence consumer purchasing decisions.

In conclusion, while Adidas seems to have a larger product range and offers discounts, the overall brand performance should be evaluated based on additional factors such as product quality, customer satisfaction, and marketing strategies. Both brands have their strengths and areas for improvement, and the choice between them would depend on individual consumer preferences and brand loyalty.