<center>
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/assets/logos/SN_web_lightmode.png" width="300" alt="cognitiveclass.ai logo">
</center>

# __Customer Personality Analysis__

## Lab. 2. Data Wrangling

Estimated time needed: **30** minutes

## About dataset
<details><summary>Click here for the details about the dataset</summary>

<b>Attributes</b>

<b>People</b>

* ID: Customer's unique identifier
* Year_Birth: Customer's birth year
* Education: Customer's education level
* Marital_Status: Customer's marital status
* Income: Customer's yearly household income
* Kidhome: Number of children in customer's household
* Teenhome: Number of teenagers in customer's household
* Dt_Customer: Date of customer's enrollment with the company
* Recency: Number of days since customer's last purchase
* Complain: 1 if the customer complained in the last 2 years, 0 otherwise

<b>Products</b>

* MntWines: Amount spent on wine in last 2 years
* MntFruits: Amount spent on fruits in last 2 years
* MntMeatProducts: Amount spent on meat in last 2 years
* MntFishProducts: Amount spent on fish in last 2 years
* MntSweetProducts: Amount spent on sweets in last 2 years
* MntGoldProds: Amount spent on gold in last 2 years

<b>Promotion</b>

* NumDealsPurchases: Number of purchases made with a discount
* AcceptedCmp1: 1 if customer accepted the offer in the 1st campaign, 0 otherwise
* AcceptedCmp2: 1 if customer accepted the offer in the 2nd campaign, 0 otherwise
* AcceptedCmp3: 1 if customer accepted the offer in the 3rd campaign, 0 otherwise
* AcceptedCmp4: 1 if customer accepted the offer in the 4th campaign, 0 otherwise
* AcceptedCmp5: 1 if customer accepted the offer in the 5th campaign, 0 otherwise
* Response: 1 if customer accepted the offer in the last campaign, 0 otherwise

<b>Place</b>

* NumWebPurchases: Number of purchases made through the company’s website
* NumCatalogPurchases: Number of purchases made using a catalogue
* NumStorePurchases: Number of purchases made directly in stores
* NumWebVisitsMonth: Number of visits to company’s website in the last month

<b>Target:</b> Influence of customer personality on the amount of wine purchased.
</details>

## Objectives

After completing this lab you will be able to:

*   Handle missing values
*   Correct data format
*   Standardize and normalize data


<h2>Table of Contents</h2>

<div class="alert alert-block alert-info" style="margin-top: 20px">
<ol>
    <li><a href="#id1">Identify and handle missing values</a>
        <ul>
            <li><a href="#id1.1">Identify missing values</a></li>
            <li><a href="#id1.2">Deal with missing values</a></li>
            <li><a href="#id1.3">Correct data format</a></li>
        </ul>
    </li>
    <li><a href="#id2">Data standardization</a></li>
    <li><a href="#id3">Data normalization (centering/scaling)</a></li>
    <li><a href="#id4">Additional columns</a></li>
    <li><a href="#id5">Binning</a></li>
    <li><a href="#id6">Sorting</a></li>
    <li><a href="#id7">Grouping</a></li>
</ol>

</div>

<hr>


<h1 id="data_acquisition">Data Acquisition</h1>
<details><summary>Click here for the details about the format, source and license of the dataset</summary>
<br>
<p>
There are various formats for a dataset: .csv, .json, .xlsx  etc. The dataset can be stored in different places, on your local machine or sometimes online.<br>

In this section, you will learn how to load a dataset into our Jupyter Notebook.<br>

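In our case, the Customer Personality Dataset is an online source, and it is distributed as a .csv file. In this copy the fields are separated by tabs rather than commas, which is why <code>sep="\t"</code> is passed when the file is read. Let's use this dataset as an example to practice data reading.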

<ul>
    <li>Data source: <a href="https://www.kaggle.com/datasets/imakash3011/customer-personality-analysis?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01">https://www.kaggle.com/datasets/imakash3011/customer-personality-analysis</a></li>
    <li>Data type: csv</li>
    <li>License: <a href="https://creativecommons.org/publicdomain/zero/1.0/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01">CC0: Public Domain</a></li>
</ul>
        
The Pandas Library is a useful tool that enables us to read various datasets into a dataframe; our Jupyter notebook platforms have a built-in <b>Pandas Library</b> so that all we need to do is import Pandas without installing.
</p>
</details>


<h2>What is the purpose of data wrangling?</h2>


Data wrangling is the process of converting data from the initial format to a format that may be better for analysis.


<h4>Import pandas</h4> 


In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
pd.set_option('display.precision', 2)

<h2>Reading the dataset from the URL</h2>


First, we assign the URL of the dataset to "filename".


In [ ]:
filename = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMSkillsNetwork-GPXX0TS8EN/marketing_campaign.csv"

In [ ]:
df = pd.read_csv(filename, sep="\t", index_col=0)
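
To see what the <code>sep</code> and <code>index_col</code> arguments do, here is a minimal sketch that reads a tiny tab-separated text from memory instead of the URL. The text in `sample_text` and its values are made up for illustration; only the column names are borrowed from the dataset.

```python
import io
import pandas as pd

# Hypothetical sample: made-up values in a tab-separated layout
sample_text = (
    "ID\tYear_Birth\tIncome\n"
    "101\t1970\t52000\n"
    "102\t1985\t\n"  # the empty last field is read as NaN
)

# sep="\t" splits fields on tabs; index_col=0 uses the first column (ID) as the row index
sample_df = pd.read_csv(io.StringIO(sample_text), sep="\t", index_col=0)
print(sample_df)
print(sample_df.index.name)
```

Because the first column becomes the index, "ID" no longer appears among the regular columns of the dataframe.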

Use the indexer <b>iloc[a:b]</b> to display a range of rows of the dataframe by position: it returns the rows at positions a through b-1 (position b itself is excluded).


In [ ]:
df.iloc[10:30]

As we can see, several NaN values appeared in the dataframe; those are missing values which may hinder our further analysis.

<div>So, how do we identify all those missing values and deal with them?</div> 

<b>How to work with missing data?</b>

Steps for working with missing data:

<ol>
    <li>Identify missing data</li>
    <li>Deal with missing data</li>
    <li>Correct data format</li>
</ol>


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id1"><font color="black">Identify and handle missing values</font></a></b>

<b style="font-size: 19px; text-decoration: none;"><a name="id1.1"><font color="black">Identify missing values</font></a></b>
<p>In the customer personality dataset, missing data appears as NaN values. In other datasets it may be marked with a placeholder such as the question mark "?". 
In that case we would replace "?" with NaN (Not a Number), the default missing-value marker in pandas, for reasons of computational speed and convenience. To do so we use the method: 
 <pre>.replace(A, B, inplace = True) </pre>
which replaces A with B.</p>
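
The replacement described above can be sketched on a small made-up dataframe. The dataframe `toy` and its values are hypothetical; our dataset does not contain "?" markers.

```python
import numpy as np
import pandas as pd

# Hypothetical data in which missing entries are marked with "?"
toy = pd.DataFrame({
    "Income": ["52000", "?", "61000"],
    "Education": ["PhD", "Master", "?"],
})

# Replace the placeholder with NaN across the whole dataframe
toy = toy.replace("?", np.nan)

# Once the placeholder is gone, the numeric column can be converted
toy["Income"] = toy["Income"].astype("float")
print(toy)
print(toy.dtypes)
```

Here the result is assigned back to `toy` instead of using `inplace=True`; both forms work when `replace` is called on a whole dataframe.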


<h4>Evaluating for Missing Data</h4>

When the file is read, pandas converts empty fields to NaN by default. There are two methods to detect these missing values:

<ol>
    <li><b>.isnull()</b></li>
    <li><b>.notnull()</b></li>
</ol>
Applied to a dataframe, each method returns a dataframe of the same shape filled with boolean values that indicate whether each entry is missing (<b>.isnull()</b>) or present (<b>.notnull()</b>).


In [ ]:
missing_data = df.isnull()
missing_data[10:30]

"True" means the value is a missing value while "False" means the value is not a missing value.


<h4>Count missing values in each column</h4>
<p>
Using a for loop in Python, we can quickly figure out the number of missing values in each column. As mentioned above, "True" represents a missing value and "False" means the value is present in the dataset. In the body of the for loop, the method ".value_counts()" counts how many times "True" and "False" occur in the column. 
</p>
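
The same counts can also be obtained without a loop, because summing boolean values treats "True" as 1. The sketch below uses a made-up dataframe `toy` (hypothetical values) to show the idea.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing incomes
toy = pd.DataFrame({
    "Income": [52000.0, np.nan, 61000.0, np.nan],
    "Kidhome": [0, 1, 0, 2],
})

# isnull() gives True/False per cell; sum() adds up the True values per column
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# Keep only the columns that actually contain missing values
print(missing_per_column[missing_per_column > 0])
```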


In [ ]:
for column in missing_data.columns:
    print(column)
    print(missing_data[column].value_counts())
    print("")

Based on the summary above, each column has 2240 rows of data and only one of the columns contains missing data:

	"Income": 24 missing data


<b style="font-size: 19px; text-decoration: none;"><a name="id1.2"><font color="black">Deal with missing data</font></a></b><br>
<b>How to deal with missing data?</b>

<ol>
    <li>Drop data<br>
        a. Drop the whole row<br>
        b. Drop the whole column
    </li>
    <li>Replace data<br>
        a. Replace it by mean<br>
        b. Replace it by frequency<br>
        c. Replace it based on other functions
    </li>
</ol>


Whole columns should be dropped only if most entries in the column are empty. In our dataset, none of the columns are empty enough to drop entirely.
We have some freedom in choosing which method to use to replace data; however, some methods may seem more reasonable than others. We will apply the "Replace by mean" method to the "Income" column.
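
The options listed above can be compared on a small made-up dataframe. This is only a sketch: `toy` and its values are hypothetical, and which option is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column
toy = pd.DataFrame({
    "Income": [40000.0, np.nan, 60000.0, 50000.0],
    "Education": ["PhD", "Master", None, "PhD"],
})

# 1a. Drop every row whose "Income" is missing
dropped_rows = toy.dropna(subset=["Income"])

# 2a. Replace missing numeric values with the column mean
filled_mean = toy["Income"].fillna(toy["Income"].mean())

# 2b. Replace missing categories with the most frequent value
most_frequent = toy["Education"].mode()[0]
filled_frequent = toy["Education"].fillna(most_frequent)

print(dropped_rows)
print(filled_mean)
print(filled_frequent)
```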


<h4>Calculate the mean value for the "Income" column </h4>


In [ ]:
avg_income = df["Income"].astype("float").mean(axis=0)
print("Average of income: {:.2f}".format(avg_income))

<h4>Replace "NaN" with mean value in "Income" column</h4>


In [ ]:
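# assign the result back to the column; calling replace(..., inplace=True) on a
# selected column may not update df in recent pandas versions
df["Income"] = df["Income"].replace(np.nan, avg_income)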

In [ ]:
df["Income"][10:30]

<b>Good!</b> Now, we have a dataset with no missing values.


<b style="font-size: 19px; text-decoration: none;"><a name="id1.3" id="correct_data_format"><font color="black">Correct data format</font></a></b><br>
<b>We are almost there!</b>
<p>The last step in data cleaning is checking and making sure that all data is in the correct format (int, float, text or other).</p>

In Pandas, we use:

<p><b>.dtypes</b> to check the data type of every column (or <b>.dtype</b> for a single column)</p>
<p><b>.astype()</b> to change the data type</p>


<h4>Let's list the data types for each column</h4>


In [ ]:
df.dtypes

<p>As we can see above, all columns have the correct data type. Numerical variables should have type 'float' or 'int', and variables with strings such as categories should have type 'object'. For example, the 'Year_Birth' variable has numerical values that indicate the customer's birth year, so we should expect it to be of type 'float' or 'int', and so it is. If a variable has an incorrect data type, we convert it into the proper format using the "astype()" method.</p> 
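
As an illustration of such a conversion, here is a minimal sketch on a made-up dataframe in which a year and a date were stored as text. The dataframe `toy`, its values, and the day-month-year date format are assumptions for this example only. For dates, pandas provides the separate function `pd.to_datetime`.

```python
import pandas as pd

# Hypothetical data where numbers and dates arrived as strings
toy = pd.DataFrame({
    "Year_Birth": ["1970", "1985"],
    "Dt_Customer": ["04-09-2012", "08-03-2014"],
})

# Convert the text column to integers
toy["Year_Birth"] = toy["Year_Birth"].astype("int")

# Convert the text column to dates; the format string is an assumption
toy["Dt_Customer"] = pd.to_datetime(toy["Dt_Customer"], format="%d-%m-%Y")

print(toy.dtypes)
```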


<b>Wonderful!</b>

Now we have finally obtained the cleaned dataset, with no missing values and with all data in its proper format.


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id2"><font color="black">Data Standardization</font></a></b>
<p>
Data is usually collected from different agencies in different formats.
(Data standardization is also a term for a particular type of data normalization where we subtract the mean and divide by the standard deviation.)
</p>

<b>What is standardization?</b>

<p>Standardization is the process of transforming data into a common format, allowing the researcher to make meaningful comparisons.
</p>

<b>Example</b>

<p>In our dataset, we have the column "Marital_Status". Let's see which unique values it has.</p>


In [ ]:
df['Marital_Status'].unique()

We can say that 'Single' and 'Alone' mean the same thing. So let's replace all 'Alone' values with 'Single'.


In [ ]:
df['Marital_Status'] = df['Marital_Status'].replace('Alone', 'Single')
df['Marital_Status'].unique()

We assigned the result back to the column so that the change is stored in the dataframe. As we can see, there are no 'Alone' values anymore.


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id3"><font color="black">Data Normalization</font></a></b>

<b>Why normalization?</b>

<p>Normalization is the process of transforming values of several variables into a similar range. Typical normalizations include scaling the variable so the variable average is 0, scaling the variable so the variance is 1, or scaling the variable so the variable values range from 0 to 1.
</p>

<b>Example</b>

<p>To demonstrate normalization, let's say we want to scale the columns "MntMeatProducts" and "MntGoldProds".</p>
<p><b>Target:</b> we would like to normalize those variables so their values range from 0 to 1</p>
<p><b>Approach:</b> replace original value by (original value)/(maximum value)</p>


In [ ]:
# replace (original value) by (original value)/(maximum value)
MntMeatProducts_norm = df['MntMeatProducts'] / df['MntMeatProducts'].max()
MntGoldProds_norm = df['MntGoldProds'] / df['MntGoldProds'].max()

In [ ]:
MntMeatProducts_norm.head()

We can see that the normalized values range from 0 to 1, but the output format is not very pretty. Let's change it.


In [ ]:
pd.set_option('display.float_format', lambda x: '%.4f' % x)
MntMeatProducts_norm.head()

Now we have rounded results.


That is not the only way to normalize data. For example, we can use the scikit-learn library (imported as "sklearn") to do it. 


In [ ]:
from sklearn import preprocessing

scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
d = scaler.fit_transform(df[['MntMeatProducts']])
print(d)

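As we can see, the results are the same. The two approaches agree whenever the minimum of the column is 0, because MinMaxScaler computes (original value - minimum value)/(maximum value - minimum value).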
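
The typical normalizations mentioned in this section can be written out directly in pandas. The sketch below uses a made-up series `values` (hypothetical numbers) whose minimum is not 0, so the difference between simple scaling and min-max scaling is visible.

```python
import pandas as pd

# Hypothetical values; note that the minimum is 10, not 0
values = pd.Series([10.0, 20.0, 30.0, 40.0])

# Simple feature scaling: divide by the maximum
simple = values / values.max()

# Min-max scaling: maps the minimum to 0 and the maximum to 1
min_max = (values - values.min()) / (values.max() - values.min())

# Z-score: subtract the mean, divide by the standard deviation
# (pandas .std() uses the sample standard deviation by default)
z_score = (values - values.mean()) / values.std()

print(pd.DataFrame({"simple": simple, "min_max": min_max, "z_score": z_score}))
```

Simple scaling leaves the smallest value at 0.25 here, while min-max scaling maps it to 0.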


<div class="alert alert-danger alertdanger" style="margin-top: 20px">
<b style="font-size: 2em; font-weight: bold;"> Question #1: </b>

<b>According to the example above, normalize the column "MntFruits".</b>

</div>


In [ ]:
# Write your code below and press Shift+Enter to execute 


<details><summary>Click here for the solution</summary>

```python
df['MntFruits'] = df['MntFruits'] / df['MntFruits'].max() 

# show the scaled column
df[["MntFruits"]].head()


```

</details>


We have now normalized "MntMeatProducts", "MntGoldProds" and "MntFruits" to the range \[0,1].


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id4"><font color="black">Additional columns</font></a></b>


<h3>Calculate new data based on given values</h3>


Let's find the average monthly income. To do this, we add a new column whose value equals "Income" / 12.


In [ ]:
df["Average Income"] = df["Income"] / 12
df["Average Income"].head()

Here we can see the average monthly income of each customer's household.


<div class="alert alert-danger alertdanger" style="margin-top: 20px">
<b style="font-size: 2em; font-weight: bold;"> Question #2: </b>

<b>According to the example above, add a new column "Average annual spent on wine" which equals "MntWines" / 2.</b>

</div>


In [ ]:
# Write your code below and press Shift+Enter to execute 


<details><summary>Click here for the solution</summary>

```python
df["Average annual spent on wine"] = df["MntWines"] / 2
df["Average annual spent on wine"].head()
```

</details>


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id5"><font color="black">Binning</font></a></b><br>
<b>Why binning?</b>
<p>
    Binning is a process of transforming continuous numerical variables into discrete categorical 'bins' for grouped analysis.
</p>

<b>Example: </b>

<p>In our dataset, "MntWines" is a real-valued variable ranging from 0 to 1493 with 776 unique values. It is the amount of money a customer spent on wine in the last 2 years. What if we only need to know roughly how much customers spend on wine: little, less than average, more than average, or a lot (4 categories)? We can rearrange the values into four 'bins' to simplify the analysis.</p>

<p>We will use the pandas method 'cut' to segment the 'MntWines' column into 4 bins.</p>


<h3>Example of Binning Data In Pandas</h3>


Let's plot the histogram of MntWines to see what its distribution looks like.


In [ ]:
%matplotlib inline
import matplotlib as plt  # rebinds plt to the top-level matplotlib package, hence the plt.pyplot calls below
from matplotlib import pyplot
plt.pyplot.hist(df["MntWines"])

# set x/y labels and plot title
plt.pyplot.xlabel("Amount spent on wine")
plt.pyplot.ylabel("Customer count")
plt.pyplot.title("Bins")

<p>We would like 4 bins of equal width, so we use numpy's <code>linspace(start_value, end_value, numbers_generated)</code> function.</p>
<p>Since we want to include the minimum value of MntWines, we set start_value = min(df["MntWines"]).</p>
<p>Since we want to include the maximum value of MntWines, we set end_value = max(df["MntWines"]).</p>
<p>Since we are building 4 bins of equal length, there should be 5 dividers, so numbers_generated = 5.</p>


We build an array of bin edges that runs from the minimum value to the maximum value in equal steps. These values determine where one bin ends and the next begins.


In [ ]:
bins = np.linspace(min(df["MntWines"]), max(df["MntWines"]), 5)
bins

We set the group names:


In [ ]:
group_names = ['Little', 'Less', 'More', 'A lot']

We apply the function "cut" to determine which bin each value of `df['MntWines']` belongs to.


In [ ]:
df['MntWines binned'] = pd.cut(df['MntWines'], bins, labels=group_names, include_lowest=True)
df[['MntWines', 'MntWines binned']].head(20)

Let's see the number of customers in each bin:


In [ ]:
df["MntWines binned"].value_counts()

Let's plot the distribution of each bin:


In [ ]:
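# value_counts() sorts by frequency, so reindex to match the order of group_names
bin_counts = df["MntWines binned"].value_counts().reindex(group_names)
pyplot.bar(group_names, bin_counts)

# set x/y labels and plot title
pyplot.xlabel("Amount spent on wine")
pyplot.ylabel("Customer count")
pyplot.title("Bins")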

<p>
    Look at the dataframe above carefully. You will find that the last column provides the bins for "MntWines" based on 4 categories ("Little", "Less", "More" and "A lot"). 
</p>
<p>
    We successfully reduced 776 unique values to 4 bins!
</p>
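
To make the role of the bin edges and of <code>include_lowest=True</code> concrete, here is a minimal sketch on made-up values. The series `spend` is hypothetical and was chosen so that the edges come out as round numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical spending values
spend = pd.Series([0, 10, 250, 500, 750, 1000])

# 5 equally spaced edges define 4 bins of equal width
edges = np.linspace(spend.min(), spend.max(), 5)
labels = ["Little", "Less", "More", "A lot"]

# By default each bin excludes its left edge, so the minimum value gets no bin
without_lowest = pd.cut(spend, edges, labels=labels)

# include_lowest=True makes the first bin include the minimum value
with_lowest = pd.cut(spend, edges, labels=labels, include_lowest=True)

print(edges)
print(pd.DataFrame({"spend": spend, "without": without_lowest, "with": with_lowest}))
```

A value that falls exactly on an inner edge, such as 250 here, is assigned to the bin on its left.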


<h3>Bins Visualization</h3>
Normally, a histogram is used to visualize the distribution of the bins we created above. 


In [ ]:
# draw histogram of attribute "MntWines" with bins = 4
pyplot.hist(df["MntWines"], bins=4)

# set x/y labels and plot title
pyplot.xlabel("Amount spent on wine")
pyplot.ylabel("Customer count")
pyplot.title("Bins")

The plot above shows the binning result for the attribute "MntWines".


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id6"><font color="black">Sorting</font></a></b>


<p>In our dataset we have columns such as "Year_Birth", "Income", "Education" and "Marital_Status". We will sort the dataframe by them in different ways using the <code>sort_values()</code> function.</p>


Let's sort customers by age. The older the customer, the lower the "Year_Birth" value.


In [ ]:
df.sort_values("Year_Birth")

Here we can see that customers are sorted from oldest to youngest.


Let's sort customers by their income. We want it to be in descending order. To do that we need to use the "ascending=False" parameter.


In [ ]:
df.sort_values("Income", ascending=False)

Here we can see that the first customer earns the most and the last one the least.


We can also sort by multiple columns. Let's say we want to see customers from youngest to oldest. If they are the same age, the customer who has the higher income should be displayed first. Here is how we can do it:


In [ ]:
df.sort_values(["Year_Birth", "Income"], ascending=False)
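
When the columns need different directions, `ascending` also accepts a list with one value per column. The sketch below uses a made-up dataframe `toy` (hypothetical values) and sorts from oldest to youngest while still showing the higher income first within the same birth year.

```python
import pandas as pd

# Hypothetical customers
toy = pd.DataFrame({
    "Year_Birth": [1980, 1970, 1980, 1970],
    "Income": [30000, 50000, 45000, 20000],
})

# Year_Birth ascending (oldest first), Income descending within each year
result = toy.sort_values(["Year_Birth", "Income"], ascending=[True, False])
print(result)
```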

Let's sort customers by their education. We want the customers with the highest qualification to be displayed first. If we sort as before, the values are ordered alphabetically as strings, which does not reflect the level of qualification.


In [ ]:
df.sort_values("Education")

To do it correctly we can use the "key" parameter and assign to it a function that determines the order in which values are displayed. First we will get all possible values of "Education" using the <code>unique()</code> function.


In [ ]:
df["Education"].unique()

Now we will create the "sort_by_education" function, which takes "x" as a parameter. x is the Series of values of the "Education" column. Inside the function we create a dictionary where each key is an education qualification and each value is a number that indicates its position in the order. Then we use the <code>map()</code> function to translate every qualification into its number, and the rows are sorted by these numbers.


In [ ]:
def sort_by_education(x):
    d = {'PhD': 1, 'Master': 2, '2n Cycle': 3, 'Graduation': 4, 'Basic': 5}
    return x.map(d)


df.sort_values(by="Education", key=sort_by_education)

Here we can see that customers who have a PhD are displayed first and customers with basic education last.
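
An alternative to the "key" function, shown here only as a sketch, is to store the column as an ordered categorical so that pandas knows the order itself. The dataframe `toy` is made up, and the order in `education_order` simply repeats the ranking used in the dictionary above.

```python
import pandas as pd

# Same ranking as in the dictionary above, highest qualification first
education_order = ["PhD", "Master", "2n Cycle", "Graduation", "Basic"]

# Hypothetical customers
toy = pd.DataFrame({"Education": ["Basic", "PhD", "Graduation", "Master", "2n Cycle"]})

# An ordered categorical sorts by category position instead of alphabetically
toy["Education"] = pd.Categorical(toy["Education"], categories=education_order, ordered=True)

result = toy.sort_values("Education")
print(result)
```

With this approach the order is attached to the column, so later calls to `sort_values("Education")` need no extra argument.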


<div class="alert alert-danger alertdanger" style="margin-top: 20px">
<b style="font-size: 2em; font-weight: bold;"> Question #3: </b>

<b>According to the example above, sort customers by their marital status. We want it to be in the following order: Single, Together, Married, Divorced, Widow, Absurd, YOLO.</b>

</div>


In [ ]:
# Write your code below and press Shift+Enter to execute 


<details><summary>Click here for the solution</summary>

```python
def sort_by_marital_status(x):
    d = {'Single': 1, 'Together': 2, 'Married': 3, 'Divorced': 4, 'Widow': 5, 'Absurd': 6, 'YOLO': 7}
    return x.map(d)

df.sort_values(by="Marital_Status", key=sort_by_marital_status)

```

</details>


<b style="font-size: 25px; font-weight: bold; text-decoration: none;"><a name="id7"><font color="black">Grouping</font></a></b>


Let's find out how the marital status of customers and the number of children they have are related.


In [ ]:
grouped_data = df.groupby(['Marital_Status', 'Kidhome']).size().reset_index(name='count')

# use pivot_table to transform the data into the desired format
pivot_data = pd.pivot_table(grouped_data, values='count', index='Marital_Status', columns='Kidhome', fill_value=0)
print(pivot_data)

Let's break down what we did here. 
<p><code>df.groupby(['Marital_Status', 'Kidhome'])</code> creates a groupby object by grouping df by two columns: 'Marital_Status' and 'Kidhome'.</p>

<p><code>.size().reset_index(name='count')</code> returns the size of each group and then resets the index of the resulting Series (which has 'Marital_Status' and 'Kidhome' as the multi-level index), and sets the name of the resulting column to 'count'.</p>

<p>Then we use <code>pivot_table()</code> to display the results in a readable format.</p>
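
The steps above can be traced on a small made-up dataframe. The dataframe `toy` and its values are hypothetical. The sketch also computes the same table with `pd.crosstab`, which counts the combinations in a single call.

```python
import pandas as pd

# Hypothetical customers
toy = pd.DataFrame({
    "Marital_Status": ["Single", "Married", "Married", "Single", "Married"],
    "Kidhome": [0, 1, 1, 1, 0],
})

# Step 1: count the rows in each (Marital_Status, Kidhome) group
counts = toy.groupby(["Marital_Status", "Kidhome"]).size().reset_index(name="count")

# Step 2: spread Kidhome values into columns
pivoted = pd.pivot_table(counts, values="count", index="Marital_Status",
                         columns="Kidhome", fill_value=0)

# Same table in one call
crossed = pd.crosstab(toy["Marital_Status"], toy["Kidhome"])

print(counts)
print(pivoted)
print(crossed)
```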


Let's say we want to see the relationship between a customer's age and their education and marital status. We can use the <code>crosstab()</code> function to do it. We will also use the <code>cut()</code> function to split 'Year_Birth' into ranges. 


In [ ]:
education_tab = pd.crosstab(pd.cut(df['Year_Birth'], [1940, 1950, 1960, 1970, 1980, 1990, 2000]), df['Education'], rownames=['Year_Birth'], colnames=['Education'])
print(education_tab)

<div class="alert alert-danger alertdanger" style="margin-top: 20px">
<b style="font-size: 2em; font-weight: bold;"> Question #4: </b>

<b>According to the example above, create a crosstable 'marital_status_tab' with 'Year_Birth' and 'Marital_Status'</b>

</div>


In [ ]:
# Write your code below and press Shift+Enter to execute 


<details><summary>Click here for the solution</summary>

```python
marital_status_tab = pd.crosstab(pd.cut(df['Year_Birth'], [1940, 1950, 1960, 1970, 1980, 1990, 2000]), df['Marital_Status'], rownames=['Year_Birth'], colnames=['Marital_Status'])
print(marital_status_tab)

```

</details>


To display the tables side by side we can use the <code>concat()</code> function. We pass it the list of tables we want to concatenate and the axis to concatenate along.


In [ ]:
pd.concat([education_tab, marital_status_tab], axis=1)
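
When tables are placed side by side, it can be hard to tell which columns came from which table. As a sketch, the optional `keys` argument of `concat()` adds a top-level label for each table. The two small tables below are made up for illustration.

```python
import pandas as pd

# Hypothetical count tables that share the same row labels
left = pd.DataFrame({"PhD": [1, 2], "Master": [3, 4]}, index=["1970s", "1980s"])
right = pd.DataFrame({"Single": [5, 6], "Married": [7, 8]}, index=["1970s", "1980s"])

# axis=1 places the tables side by side; keys labels each block of columns
combined = pd.concat([left, right], axis=1, keys=["Education", "Marital_Status"])
print(combined)

# A whole block can be selected by its key
print(combined["Marital_Status"])
```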

Let's see how the education qualification of customers relates to their yearly household income. First we bin "Income" with the <code>cut()</code> function: the first argument is the column and the second is the list of values that serve as the bin limits.


In [ ]:
df['Income binned'] = pd.cut(df['Income'], [1000, 5000, 20000, 40000, 70000, 100000, float("inf")])
df['Income binned']

In [ ]:
grouped_data = df.groupby(['Education', df['Income binned']]).size().reset_index(name='count')

pivot_data = pd.pivot_table(grouped_data, values='count', index='Education', columns='Income binned', fill_value=0)
print(pivot_data)

As expected, the table suggests that customers with higher education qualifications tend to earn more.


<div class="alert alert-danger alertdanger" style="margin-top: 20px">
<b style="font-size: 2em; font-weight: bold;"> Question #5: </b>

<b>According to the examples above, group customers by their yearly household income and amount spent on wine. To make it clearer, here is what the rows and columns should look like:</b>
<ul>
    <li>Rows: 1000-5000, 5001-20000, 20001-40000, 40001-70000, 70001-100000, and >=100001.</li>
    <li>Columns: 1-100, 101-200, 201-500, 501-800, 801-1100, 1101-1500</li>
</ul>
    
<b>Hint: use "Average annual spent on wine" column</b>

</div>


In [ ]:
# Write your code below and press Shift+Enter to execute 


<details><summary>Click here for the solution</summary>

```python
grouped_data = df.groupby([pd.cut(df['Income'], [1000, 5000, 20000, 40000, 70000, 100000, float("inf")]), 
                           pd.cut(df['Average annual spent on wine'], [0, 100, 200, 500, 800, 1100, 1500])]).size().reset_index(name='count')

pivot_data = pd.pivot_table(grouped_data, values='count', index='Income', columns='Average annual spent on wine', fill_value=0)
print(pivot_data)

```

</details>


Save the new csv:

In [ ]:
df.to_csv('clean_df.csv')

> Note: The csv file cannot be viewed in the JupyterLite-based SN Labs environment. However, you can click <a href="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-SkillsNetwork/labs/Module%202/DA0101EN-2-Review-Data-Wrangling.ipynb?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkDA0101ENSkillsNetwork20235326-2022-01-01">HERE</a> to download the lab notebook (.ipynb) to your local machine and view the csv file once the notebook is executed.


### Thank you for completing this lab!


## Authors


Developer: [Yaroslav Vyklyuk, prof., PhD., DrSc](http://vyklyuk.bukuniver.edu.ua/en/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01)


Retail Consultant: [Olha Vdovichena, ass. prof, PhD](https://scholar.google.ru/citations?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01&user=3vIQ33YAAAAJ&hl=uk)


Developer: [Stepan Sarabun](https://author.skills.network/instructors/stepan_sarabun_2?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01)


Copyright &copy; 2023 IBM Corporation. This notebook and its source code are released under the terms of the [MIT License](https://cognitiveclass.ai/mit-license/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkGuidedProjectsIBMSkillsNetworkGPXX0TS8EN2679-2023-01-01).
