## Agenda:
1. [Introduction](#1)
    - 1.1 [Problem Statement](#2)
    - 1.2 [Columns Description](#3)
    - 1.3 [Challenges](#4)
2. [Data Preparation & Cleaning](#5)
    - 2.1 [Packages & Helper Functions](#6)
    - 2.2 [Data Loading & Cleaning](#7) 
3. [General Exploration](#8)
4. [Answering The Proposed Questions](#9)
5. [Conclusion](#10)

<h2 align='center'><font color='#290066'>1. Introduction</font></h2><a id=1></a>

### 1.1 Problem Statement <a id=2></a>
> Nelson Mandela believed education was the most powerful weapon to change the world. The COVID-19 Pandemic has disrupted learning for more than 56 million students in the United States. In the Spring of 2020, most states and local governments across the U.S. closed educational institutions to stop the spread of the virus. In response, schools and teachers have attempted to reach students remotely through distance learning tools and digital platforms. Yet not every student has equal opportunities to learn. Effective policies and plans need to be enacted in order to make education more equitable.

### 1.2 Columns Description <a id=3></a>

<h2 align='center'><font color='#290066'>Engagement data</font></h2>

| Column | Description |
| :-: | :-: |
| time | Date in "YYYY-MM-DD" format |
| lp_id | The unique identifier of the product |
| pct_access | Percentage of students in the district who have at least one page-load event of a given product on a given day |
| engagement_index | Total page-load events per one thousand students of a given product on a given day |


<h2 align='center'><font color='#290066'>District Information data</font></h2>

| Column | Description |
| :-: | :-: |
| district_id | The unique identifier of the school district |
| state | The state where the district resides |
| locale | NCES locale classification that categorizes U.S. territory into four types of areas: City, Suburban, Town, and Rural. See Locale Boundaries User's Manual for more information. |
| pct_black/hispanic | Percentage of students in the districts identified as Black or Hispanic based on 2018-19 NCES data |
| pct_free/reduced | Percentage of students in the districts eligible for free or reduced-price lunch based on 2018-19 NCES data |
| county_connections_ratio | Ratio (residential fixed high-speed connections over 200 kbps in at least one direction/households) based on the county level data from FCC Form 477 (December 2018 version). See FCC data for more information. |
| pp_total_raw | Per-pupil total expenditure (sum of local and federal expenditure) from Edunomics Lab's National Education Resource Database on Schools (NERD$) project. The expenditure data are school-by-school, and we use the median value to represent the expenditure of a given school district. |

<h2 align='center'><font color='#290066'>Product Information data</font></h2>

| Column | Description |
| :-: | :-: |
| LP ID | The unique identifier of the product |
| URL | Web link to the specific product |
| Product Name | Name of the specific product |
| Provider/Company Name | Name of the product provider |
| Sector(s) | Sector of education where the product is used |
| Primary Essential Function | The basic function of the product. There are two layers of labels here. Products are first labeled as one of these three categories: LC = Learning & Curriculum, CM = Classroom Management, and SDO = School & District Operations. Each of these categories has multiple sub-categories with which the products were labeled |

### 1.3 Challenges <a id=4></a>
> Exploring <br>
    (1) the state of digital learning in 2020 and <br>
    (2) how the engagement of digital learning relates to factors such as district demographics, broadband access, and state/national level policies and events.
   
**Below are some examples of questions that relate to the problem statement:**

   - What is the picture of digital connectivity and engagement in 2020?
   - What is the effect of the COVID-19 pandemic on online and distance learning, and how might this also evolve in the future?
   - How does student engagement with different types of education technology change over the course of the pandemic?
   - How does student engagement with online learning platforms relate to different geography? Demographic context (e.g., race/ethnicity, ESL, learning disability)? Learning context? Socioeconomic status?
   - Do certain state interventions, practices or policies (e.g., stimulus, reopening, eviction moratorium) correlate with an increase or decrease in online engagement?

<h2 align='center'><font color='#290066'>2. Data Preparation & Cleaning</font></h2><a id=5></a>

### 2.1 Packages & Helper Functions <a id=6></a>

In [None]:
import os 

# For Loading and Manipulating data
import pandas as pd
import numpy as np

# For visualization purposes
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

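# To change the style of the plots (so that we all see the same thing :) )
# Matplotlib 3.6 renamed the 'seaborn' style to 'seaborn-v0_8'; fall back to the old name on older versions
try:
    plt.style.use('seaborn-v0_8')
except OSError:
    plt.style.use('seaborn')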

# for coloring the printed output
from termcolor import colored

In [None]:
def show_missing(dataframe):
    """
    This function helps you to know the count and percentage of the missing values in each column in the given dataframe
    
    Args:
    dataframe: the dataframe we want to investigate the NaNs on
    
    Returns:
    missing_data: a dataframe which has:
                     1- index of the columns names 
                     2- two columns: the count and percentage of the NaNs of that column
    """
    missing_values = dataframe.isnull().sum()
    missing_data = pd.DataFrame({'Count':missing_values.values,
                                 'Percentage':(missing_values.values*100.00 / dataframe.shape[0]).round(2)}, 
                                  index=missing_values.index)
    
    
    return missing_data

#===========================================================================================================================#

def show_countplot(dataframe, x, y, hue=None, order=None, title='Title', x_rotation=0):
    """
    In addition to plotting a countplot, this function writes the percentage of each category inside its bar.
    
    Args:
    dataframe: the dataframe we want to work on
    x: the column to count along the x-axis (vertical bars); pass None when using y
    y: the column to count along the y-axis (horizontal bars); pass None when using x
    hue: an optional third feature to split the bars by
    order: the order in which the bars are displayed
    title: the title of the plot
    x_rotation: the rotation angle of the x ticks

    """  
    ax = sns.countplot(data=dataframe, x=x, y=y, order=order, hue=hue, color='#4d83de')
   
    plt.title(title, fontsize=20, color='brown')
    
    plt.xlabel(x.title() if x else 'Count', fontsize=15)
    plt.xticks(rotation=x_rotation, fontsize=12)
    plt.ylabel(y.title() if y else 'Count', fontsize=15)
    plt.yticks(fontsize=12)
    
    total = dataframe.shape[0]
    for patch in ax.patches:
        loc = patch.get_x() if x else patch.get_y()
        width = patch.get_width()
        height = patch.get_height()
        
        # text location
        loc_x = (loc+width/2) if x else (width*0.5)
        loc_y = (height*0.5) if x else (loc+height/2)
        
        # text
        percent = (height*100/total) if x else (width*100/total)
        
        ax.text(loc_x, loc_y, f'{percent:.2f}%', ha='center', weight='bold', fontsize=10, color='white')

#==============================================================================================================================#
#------------------------------------------------------------------------------------------------------------------------------#
# Note: the following functions may seem a little vague right now, but everything will be clear once we see how they are used
#------------------------------------------------------------------------------------------------------------------------------#
#==============================================================================================================================#

def fill_it(product_name, sector, PEF):
    """
    Filling the specified row by its approximated value
    
    Args:
    product_name: to specify which row we're working on
    sector: the approximated value of the sector for this product
    PEF: the approximated value of the primary essential function for this product
    """
    global products_info
    products_info.loc[products_info['Product Name']==product_name, 'Sector(s)'] = sector
    products_info.loc[products_info['Product Name']==product_name, 'Primary Essential Function'] = PEF

    
def remove_it(product_name):
    """
    Removing the specified row
    
    Args:
    product_name: to specify which row we're working on
    """
    global products_info
    products_info = products_info[products_info['Product Name']!=product_name].copy()
    
#=================================================================================================================#

def eng_data():
    """
    A function to generate the files of engagement data folder
    """
    path = '../input/learnplatform-covid19-impact-on-digital-learning/engagement_data'
    files = os.listdir(path)

    for file in files:
        df = pd.read_csv(os.path.join(path,file))
        
        yield df
        

def fill_data(data):
    """
    A function that drops or fills the NaNs of each generated file (in place)
    """
    # If the product id does not exist, the whole record is not useful
    data.dropna(subset=['lp_id'], inplace=True)
    
    # If the id exists but both measurements are missing, the whole row is not useful either
    data.dropna(subset=['pct_access', 'engagement_index'], how='all', inplace=True)
    
    # If only one value in the row is missing, we fill it with the placeholder -1
    data.fillna(-1, inplace=True)
    
#=============================================================================================================#

def show_plots(dataframe, feature, title='Title'): 
    """
    This function helps us in the visualization stage
    
    Args:
    dataframe: the dataframe we want to work on
    feature: the feature we want to visualize
    title: the title for the plot
    """
    fig = plt.figure(figsize=(25,35))
    fig.suptitle(title, fontsize=30, weight='bold', y=0.91)
    
    for i, product in enumerate(dataframe['Product Name'].unique()):
        data = dataframe[dataframe['Product Name']==product]
        data = data.groupby('month')[feature].mean()

        plt.subplot(5, 4, i+1)
        plt.plot(data.index, data.values)
        plt.title(product, color='brown')
        plt.xticks(ticks=range(1,13), labels=range(1, 13));  

#=================================================================================================================#

def check(product):
    """
    This function checks whether the given Series has all twelve months and adds the missing ones with a value of 0.
    
    Args:
    product: a Series indexed by month number (1-12)
    
    Returns:
    The modified Series (that has all the months in it)
    """
    # reindex inserts every missing month and fills it with 0
    return product.reindex(range(1, 13), fill_value=0)

#=================================================================================================================#

def investigate(feature, size, title, order=None):  
    """
    This function helps us in the visualization stage like 'show_plots' function
    
    Args:
    feature: the feature we want to plot
    size: the size of the plot
    title: the title of the plot
    order: the order of the bars
    """
    fig, axes = plt.subplots(nrows=3, ncols=1, figsize=size)
    fig.suptitle(title, fontsize=30, weight='bold', y=0.93)

    for i, subset in enumerate(to_investigate): 
        fig.sca(axes[i])
        sub_df = districts_info[districts_info['state'].isin(subset)].copy()

        show_countplot(sub_df, x=feature, y=None, order=order, title=titles[i])

### 2.2 Data Loading & Cleaning <a id=7></a>
<h2 align='center'><font color='#290066'>Products info</font></h2>

In [None]:
# reading the data
products_info = pd.read_csv('../input/learnplatform-covid19-impact-on-digital-learning/products_info.csv')
products_info.head()

In [None]:
# get more info
products_info.info()

In [None]:
# investigating the missing values
show_missing(products_info)

In [None]:
# Last but not least, are there any duplicates?
products_info.duplicated().sum()

#### As we can see, there are:
   - _Useless columns:_
       - URL
   
   - _Columns with NaNs:_
       - Provider/Company Name
       - Sector(s)
       - Primary Essential Function

#### <font color='red'>Let's fill the NaNs </font>

In [None]:
# let's take a closer look at these 20 NaNs
products_info[products_info['Sector(s)'].isnull()]

### We will try to fill the NaNs manually by visiting each URL and estimating `Sector(s)` and `Primary Essential Function`

In [None]:
# http://www.ixl.com/
fill_it('IXL Language', 'PreK-12', 'LC - Digital Learning Platforms')

# https://www.yelp.com/
# It's not something for learning
remove_it('Yelp')

# http://www.learnplatform.com/
fill_it('LearnPlatform', 'PreK-12; Corporate', 'SDO - Data, Analytics & Reporting - Student Information Systems (SIS)')

# http://genius.com/static/education
# It's not something for learning
remove_it('Education Genius')

# http://www.microsoft.com/en-us/education/products/office/default.aspx
fill_it('Microsoft Office 365', 'PreK-12; Higher Ed; Corporate', 'LC - Study Tools')

# http://www.classzone.com/cz/index.htm
# It has been retired and is no longer accessible
remove_it('ClassZone')

# http://student.classdojo.com/#/login
fill_it('ClassDojo for Students', 'PreK-12; Higher Ed', 'CM - Classroom Engagement & Instruction - Assessment & Classroom Response')

# https://play.google.com/music/listen?u=0#/sulp
# Google Play Music is no longer available
remove_it('Google Play Music')

# https://sciencejournal.withgoogle.com/
fill_it('Google Science Journal', 'PreK-12; Higher Ed', 'LC - Study Tools - Tutoring')

# https://edutrainingcenter.withgoogle.com/
# It is no longer available
remove_it('Google Training Center')

# https://info.flipgrid.com/
fill_it('Flipgrid One', 'PreK-12; Higher Ed; Corporate', 'CM - Virtual Classroom - Video Conferencing & Screen Sharing')

# https://spark.adobe.com/about/page
# It's not something for learning
remove_it('Adobe Spark Page')

# https://www.usnews.com/best-colleges/myfit
fill_it('College Compass', 'Higher Ed; Corporate', 'SDO - Data, Analytics & Reporting - Site Hosting & Data Warehousing')

# https://chrome.google.com/webstore/detail/grammarly-for-chrome/kbfnbcaeplbcioakkpcpgfkobkghlhen?hl=en
fill_it('Grammarly for Chrome', 'PreK-12; Higher Ed; Corporate', 'LC - Content Creation & Curation')

# https://www.maxpreps.com/state/connecticut.htm
# It is no longer available
remove_it('MaxPreps: Connecticut')

# https://www.ducksters.com/history/
fill_it('History for Kids', 'PreK-12', 'LC - Digital Learning Platforms')

# https://safeyoutube.net/
# It is no longer accessible
remove_it('SafeYouTube')

# https://studio.code.org
fill_it('Studio Code', 'PreK-12; Higher Ed', 'LC - Sites, Resources & Reference - Games & Simulations')

# http://edpuzzle.com
fill_it('Edpuzzle - Free (Basic Plan)', 'PreK-12; Higher Ed; Corporate', 'LC - Sites, Resources & Reference - Digital Collection & Repository')

# http://www.truenorthlogic.com/
# It is no longer accessible
remove_it('True North Logic')

> Done :)

- **Check**

In [None]:
show_missing(products_info)

> Perfect!

In [None]:
# Remove useless columns
products_info.drop('URL', axis=1, inplace=True)

> Very good. Now our "products_info" is ready.

<h2 align='center'><font color='#290066'>Districts Info</font></h2>

In [None]:
# reading the data
districts_info = pd.read_csv('../input/learnplatform-covid19-impact-on-digital-learning/districts_info.csv')
districts_info.head()

In [None]:
# get more info
districts_info.info()

In [None]:
show_missing(districts_info)

In [None]:
districts_info.duplicated().sum()

#### As we can see, NaNs are the only problem here, so <font color='red'>let's fill the NaNs </font>

In [None]:
# let's drop the rows that have NaNs in all the columns except for the district id
districts_info = districts_info.dropna(how='all', subset=['state', 'locale', 'pct_black/hispanic', 'pct_free/reduced',
                                                          'county_connections_ratio', 'pp_total_raw']) 

Let's fill the remaining NaNs with the value <font color='red'>"Unknown"</font> so that we do not lose the information in the other columns

**Note:** <br>
  > We tried to fill the NaNs based on the combination of <font color='#008000'>['state', 'locale', 'pct_black/hispanic']</font>, but it didn't work.

**For clarity, this is the code we used when trying to fill the NaNs in the "pct_free/reduced" column with that method:**

```
# Let's get the existing combinations of these three columns in our data.
# Getting the different combinations for the rows that need filling
NaNs_df = districts_info[districts_info['pct_free/reduced'].isnull()]
NaNs_df = NaNs_df[['state', 'locale', 'pct_black/hispanic']].copy()
NaNs_df.drop_duplicates(inplace=True)   
# Now let's fill the NaNs according to these combinations
for i in range(NaNs_df.shape[0]):
    state, locale, pct = NaNs_df.iloc[i, :].values
    mask = (districts_info['state']==state)&(districts_info['locale']==locale)&(districts_info['pct_black/hispanic']==pct)
    value = districts_info.loc[mask, 'pct_free/reduced'].dropna().mode()
    districts_info.loc[mask&districts_info['pct_free/reduced'].isnull(), 'pct_free/reduced'] = value
```
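
One possible reason the attempt above leaves the NaNs in place (an assumption, since it was not checked against the real data) is that `.mode()` returns a Series, and assigning a Series through `.loc` aligns on index labels instead of broadcasting a single value. Below is a minimal sketch of a group-wise mode fill on a made-up toy frame. `first_mode` is a hypothetical helper, the grouping uses only two columns for brevity, and groups without any known value simply stay NaN:

```python
import numpy as np
import pandas as pd

# Toy stand-in for districts_info; the values are made up for illustration.
toy = pd.DataFrame({
    "state": ["Utah", "Utah", "Utah", "Ohio", "Ohio"],
    "locale": ["City", "City", "City", "Rural", "Rural"],
    "pct_free/reduced": ["[0.2, 0.4[", np.nan, "[0.2, 0.4[", np.nan, np.nan],
})


def first_mode(values):
    """Return the most frequent non-null value, or NaN if the group has none."""
    modes = values.mode()  # mode() returns a Series and ignores NaNs
    return modes.iloc[0] if not modes.empty else np.nan


# Broadcast each group's mode back onto the rows of that group.
group_mode = toy.groupby(["state", "locale"])["pct_free/reduced"].transform(first_mode)

# Only the missing entries are replaced; known values are left untouched.
toy["pct_free/reduced"] = toy["pct_free/reduced"].fillna(group_mode)
print(toy)
```

Groups such as the Ohio rows here, which have no known value at all, cannot be filled this way, so a fallback like "Unknown" is still needed for them.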

As discussed above, let's fill the NaNs with the value <font color='red'>"Unknown"</font>

In [None]:
districts_info.fillna("Unknown", inplace=True)

- **Check**

In [None]:
show_missing(districts_info)

> Great! :)

<h2 align='center'><font color='#290066'>Engagement Data</font></h2>

In [None]:
to_merge = []           # putting the engagement data in one place for concatenation
eng_df = eng_data()     # defining the generator

In [None]:
for data in eng_df:
    fill_data(data)            # filling the NaNs
    to_merge.append(data)
    
all_eng = pd.concat(to_merge)

In [None]:
all_eng.head()

In [None]:
all_eng.info()

In [None]:
# Check
show_missing(all_eng)

In [None]:
# Checking duplicates
all_eng.duplicated().sum()

#### As we can see, there are:  
   - _Columns with the wrong datatype:_
       - time
   
   - _Duplicates:_
       - 5056710 duplicated rows

Let's convert the type of the "time" column from `object` to `datetime`

In [None]:
all_eng['time'] = pd.to_datetime(all_eng['time'])

- **Check**

In [None]:
all_eng.info()

In [None]:
# dropping duplicates
all_eng.drop_duplicates(inplace=True)

- **Check**

In [None]:
# duplicates
all_eng.duplicated().sum()

> Great, now we're **ready** for some exploration :).

<h3><font color='#000099'> Before doing anything, let's first combine the "all_eng" and "products_info" datasets to have more info about each product. </font></h3>

In [None]:
eng_product_merge = pd.merge(all_eng, products_info, left_on='lp_id',right_on='LP ID',how='inner').drop(columns='LP ID')
eng_product_merge.head()

In [None]:
eng_product_merge.info()

Extracting the `month` from the "time" feature

In [None]:
eng_product_merge['month'] = eng_product_merge['time'].dt.month
eng_product_merge.head()

<h2 align='center'>Before answering any questions, we need to explore the data a little to know what we are dealing with.</h2><a id=8></a>

<h2 align='center'><font color='#290066'>Products Info</font></h2>

_Provider/Company Name_

In [None]:
products_info['Provider/Company Name'].nunique()

> This number is too large to visualize, so let's plot the 20 most frequent companies.

In [None]:
# Use horizontal bars (y-axis) so that the company names stay readable
order = products_info['Provider/Company Name'].value_counts().index[:20]

plt.figure(figsize=(15, 12))
show_countplot(products_info, x=None, y='Provider/Company Name', order=order, title='Provider/Company Name')

_Sector(s)_ <a id='sec'></a>

In [None]:
sectors = eng_product_merge['Sector(s)'].str.split(';',expand=True)

# removing unnecessary spaces
for col in sectors.columns:
    sectors[col] = sectors[col].str.strip()

# Counting the occurrences of each sector in each column 
dic={}
for col in sectors.columns:
    dic[col]=(dict(sectors[col].value_counts()))

# Summing all together
sectors = pd.DataFrame(dic)
sectors_count = pd.DataFrame(sectors.fillna(0).values.sum(axis=1), index=sectors.index, columns=['Count'])
sectors_count
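
The same kind of count can probably be obtained more compactly with `Series.explode`. This is only a sketch on a made-up toy column, not the code used in this notebook:

```python
import pandas as pd

# Toy stand-in for the "Sector(s)" column; the values are made up.
toy = pd.DataFrame({
    "Sector(s)": ["PreK-12", "PreK-12; Higher Ed", "PreK-12; Higher Ed; Corporate", None]
})

sector_counts = (
    toy["Sector(s)"]
    .dropna()            # rows without a sector contribute nothing
    .str.split(";")      # one list of sectors per row
    .explode()           # one row per (record, sector) pair
    .str.strip()         # remove the spaces left around the separators
    .value_counts()
)
print(sector_counts)
```

Each record is counted once for every sector it lists, which matches the column-by-column sum above.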

In [None]:
plt.pie(sectors_count.values.ravel(), labels=sectors_count.index, startangle = 90, autopct='%1.2f%%', 
                 counterclock = False, radius = 1.2, textprops={'fontsize': 14})

plt.title('Sector(s)', fontsize=15, color='brown');

In [None]:
del sectors, sectors_count

_Primary Essential Function_ <a id=pef></a>

Main Categories ['LC', 'SDO', 'CM']

In [None]:
# Let's get the primary essential function main categories ['LC', 'SDO', 'CM'] alone 
eng_product_merge['PEF'] = eng_product_merge['Primary Essential Function'].str.split('-')
eng_product_merge['PEF'] = eng_product_merge['PEF'].apply(lambda x: x[0])
eng_product_merge['PEF'] = eng_product_merge['PEF'].str.strip()

In [None]:
pef_count = eng_product_merge['PEF'].value_counts()

plt.figure(figsize=(8, 8))
plt.pie(pef_count.values, labels=pef_count.index, startangle = 90, autopct='%1.2f%%', 
                 counterclock = False, radius = 1.2, textprops={'fontsize': 14});

plt.title('Primary Essential Function', fontsize=15, color='brown', y=1.1);

Let's explore the subcategories of each Primary Essential Function main category.

In [None]:
eng_product_merge['sub_PEF'] = eng_product_merge['Primary Essential Function'].str.split('-')
eng_product_merge['sub_PEF'] = eng_product_merge['sub_PEF'].apply(lambda y: ' '.join(map(lambda x: x.strip(), y[1:])))

In [None]:
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(15, 25))

fig.suptitle('Subcategories of each Primary Essential Function main category', fontsize=25, y=0.92, x=0.3)

for i, pef in enumerate(['LC', 'CM', 'SDO']):
    fig.sca(axes[i])
    
    temp = eng_product_merge[eng_product_merge['PEF']==pef].copy()
    order = temp['sub_PEF'].value_counts().index
    
    show_countplot(temp, x=None, y='sub_PEF', order=order, title=pef)

<h2 align='center'><font color='#290066'>Districts Info</font></h2>

In [None]:
fig, axes =  plt.subplots(6, 1, figsize=(25,65))
fig.suptitle('Districts Info', fontsize=25, y=0.9)
axes = axes.ravel()

# which axes to plot on
to_plot=[(None, 'state'),
         ('locale', None),
         ('pct_black/hispanic', None),
         ('pct_free/reduced', None),
         ('county_connections_ratio', None),
         ('pp_total_raw', None)]

for i,col in enumerate(districts_info.columns[1:]):
    fig.sca(axes[i])
    
    order=districts_info[col].value_counts().index
    show_countplot(districts_info, x=to_plot[i][0], y=to_plot[i][1], order=order, title=f'{col.title()} Percentage')

> From the previous plot we can see that the "county_connections_ratio" column is not important at all. That's why we will drop it.

In [None]:
districts_info.drop('county_connections_ratio', axis=1, inplace=True)

<h2 align='center'><font color='#290066'>Engagement Data</font></h2>

Let's explore "engagement_index" and "percentage access" over the whole year<a id='pe'></a>

In [None]:
all_eng_copy = all_eng.set_index('time')

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(15, 12))

for i, col in enumerate(['engagement_index', 'pct_access']):
    fig.sca(axes[i])
    all_eng_copy[col].plot()
    plt.title(col, fontsize=15, color='brown')

plt.tight_layout()
del all_eng_copy

> We can see that their behaviour is approximately the same over the year. Before finishing the exploration, let's see the correlation between them.

In [None]:
corr = all_eng['engagement_index'].corr(all_eng['pct_access'])

print(f"Pearson's correlation between engagement index and percentage access is: {colored(corr, 'green')}")

> As expected, the correlation is quite high, so we will use only one of them in our next analysis (choosing the one most suitable for answering our questions).
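
One caveat: `fill_data` writes `-1` into rows where only one of the two measures is missing, and those placeholders enter the correlation (and any later mean) as if they were real measurements. Here is a small sketch, on made-up numbers, of how the placeholder could be masked before computing the correlation:

```python
import numpy as np
import pandas as pd

# Toy engagement rows; -1 mimics the placeholder written by fill_data.
toy = pd.DataFrame({
    "pct_access":       [0.10, 0.50, 1.20, 3.00, -1.0, 0.80],
    "engagement_index": [5.0, 40.0, 90.0, 400.0, 12.0, -1.0],
})

# Correlation with the placeholders treated as real values.
raw_corr = toy["engagement_index"].corr(toy["pct_access"])

# Turn the placeholder back into NaN; Series.corr skips incomplete pairs.
clean = toy.replace(-1, np.nan)
clean_corr = clean["engagement_index"].corr(clean["pct_access"])

print(f"with placeholders: {raw_corr:.3f}, placeholders masked: {clean_corr:.3f}")
```

How much the two numbers differ on the real data depends on how many rows carry a placeholder, which this sketch does not measure.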

<h2 align='center'><font color='#290066'>Merged Data</font></h2>

Let's see which products were used the most in 2020.

In [None]:
top_products = eng_product_merge.groupby('Product Name')['engagement_index'].mean().sort_values(ascending=False)
top_10=pd.DataFrame(top_products[:10])

plt.figure(figsize=(15,9))

sns.barplot(x=top_10['engagement_index'],y=top_10.index, color='#4d83de')
plt.title('Top 10 Products', color='brown', fontsize=15)
plt.xlabel('Engagement Index', fontsize=12)
plt.ylabel('Product Name', fontsize=12)

del top_products

<h2 align='center'><font color='blue'>Let's answer the proposed questions first, then we will dive deeper into our analysis</font></h2><a id=9></a>

We can answer the first proposed question, **"What is the picture of digital connectivity and engagement in 2020?"**, from the [previous exploration](#pe). The engagement index plot shows that engagement with e-learning platforms began low and increased over time, dropped between July and September, and then began to increase again. Two questions should be answered:<br>
&nbsp;&nbsp;&nbsp;&nbsp;(1) What caused that drop? <br>
&nbsp;&nbsp;&nbsp;&nbsp;(2) Why did the engagement begin to rise again? <br>

**For the first question, the drop may be due to:**
- Most American universities are on holiday between July and September.
- According to [WHO](https://www.who.int/emergencies/diseases/novel-coronavirus-2019/interactive-timeline?gclid=CjwKCAjwyvaJBhBpEiwA8d38vF9MS1vRVTg1UPNPzmP8DNAgsBywY6M_Q3Q9giUSd5xtato7Y79zNBoC2mIQAvD_BwE#!) the number of corona cases passed 200000 during that period, which in turn drew attention to "what should we do and how can we protect ourselves". All of that may have distracted students from learning.

**For the second one:**
- If the cause was the holiday, the rise was most likely due to the beginning of a new semester.
- If the cause was the distraction created by the pandemic news, people may simply have grown used to the situation.

Let's dive deeper to **see the details of this picture :)**

In [None]:
# number of products we have
eng_product_merge['Product Name'].nunique()

> 361 is too large a number to visualize, so we will take 20 randomly chosen products to give us the overall picture.

In [None]:
np.random.seed(42)      # to have a consistent output every time we run the code

sampled_products = np.random.choice(eng_product_merge['Product Name'].unique(), size=20, replace=False)
sampled_products = eng_product_merge[eng_product_merge['Product Name'].isin(sampled_products)].copy()

In [None]:
show_plots(sampled_products, 'engagement_index', title='Engagement Index for 20 random sampled products')

_From the previous exploration we know that there is no need to investigate the percentage access, but let's see if it adds something_

In [None]:
show_plots(sampled_products, 'pct_access',title='Percentage Access for 20 random sampled products')

> We can observe that **the two plots are very similar**, so in the next plots we will use `engagement index`, because the percentage access is tied to a specific district, whereas we now want a general analysis of product usage rather than of specific districts.

> **We can observe that the engagement index behaviour changes from product to product:<br>**
&nbsp;&nbsp;&nbsp;&nbsp; ○ For some products, usage after the mentioned drop did not return to its level before the drop, for example:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • CoolMath Games <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • CK-12 <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • Adobe Character Animator <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • Microsoft Outlook <br><br>
&nbsp;&nbsp;&nbsp;&nbsp; ○ For some products, usage after the mentioned drop approximately returned to its level before the drop, for example:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • Diax <br><br>
&nbsp;&nbsp;&nbsp;&nbsp; ○ For other products, usage after the mentioned drop surpassed its level before the drop, for example:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • Remind <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • Ellevation <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; • ZOOM Cloud Meetings <br>

<font color='#1B5EE9'>So now let's dig deeper and see what we will get.</font>

In [None]:
# For memory usage
del sampled_products

**_Let's see the impact of COVID-19 on "LC" Websites_**

In [None]:
trunc_df = eng_product_merge[eng_product_merge['PEF']=='LC'].copy()

#Let's take the 20 most used products for the "LC" category and analyse their usage along the year
sampled_products = trunc_df['Product Name'].value_counts().index[:20]
sampled_products = trunc_df[trunc_df['Product Name'].isin(sampled_products)].copy()
    
show_plots(sampled_products, 'engagement_index', title='LC: Engagement Index for the 20 most used products')

# for memory usage
del sampled_products, trunc_df

**_Let's see the impact of COVID-19 on "CM" Websites_**

In [None]:
trunc_df = eng_product_merge[eng_product_merge['PEF']=='CM'].copy()

#Let's take the 20 most used products for the "CM" category and analyse their usage along the year
sampled_products = trunc_df['Product Name'].value_counts().index[:20]
sampled_products = trunc_df[trunc_df['Product Name'].isin(sampled_products)].copy()
    
show_plots(sampled_products, 'engagement_index', title='CM: Engagement Index for the 20 most used products')

# for memory usage
del sampled_products, trunc_df

**_Let's see the impact of COVID-19 on "SDO" Websites_**

In [None]:
trunc_df = eng_product_merge[eng_product_merge['PEF']=='SDO'].copy()

#Let's take the 20 most used products for the "SDO" category and analyse their usage along the year
sampled_products = trunc_df['Product Name'].value_counts().index[:20]
sampled_products = trunc_df[trunc_df['Product Name'].isin(sampled_products)].copy()
    
show_plots(sampled_products, 'engagement_index', title='SDO: Engagement Index for the 20 most used products')

# for memory usage
del sampled_products

**_Let's see if the impact of COVID-19 differs between "LC", "CM", and "SDO" Websites_**

In [None]:
colors = ['red', 'blue', 'green']

fig = plt.figure(figsize=(15,20))
fig.suptitle('Impact of COVID-19 on ("LC", "CM", "SDO") websites', fontsize=30, weight='bold', y=0.93)

for i, pef in enumerate(['LC', 'CM', 'SDO']):
    trunc_df = eng_product_merge[eng_product_merge['PEF']==pef].copy()
    
    most_used = trunc_df['Product Name'].value_counts().index[:20]
    
    plt.subplot(3, 1, i+1)
    for product in most_used:
        product_list = trunc_df[trunc_df['Product Name']== product].groupby('month')['pct_access'].mean()
        plt.plot(product_list.index, product_list.values, color=colors[i]);
    
    plt.title(pef, fontsize=20, color='brown')
    plt.xticks(ticks=range(1,13), labels=range(1, 13));

> As we can see, it is hard to plot all these lines in one plot because of the wide range of values in `engagement_index` and, of course, `pct_access`, so we needed another measure of the usage of e-learning platforms and online learning. That's why we've used **the number of engagement records logged for each product during each month** (the row count per product and month) as a measure of how active e-learning was during that month.
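
To make the measure concrete: `groupby('month')['lp_id'].count()` counts rows, not distinct products. The sketch below uses a made-up toy frame to show the difference between the two; it is an illustration only and not part of the analysis:

```python
import pandas as pd

# Toy engagement rows: product 10 appears several times per month.
toy = pd.DataFrame({
    "month": [1, 1, 1, 2, 2],
    "lp_id": [10, 10, 20, 10, 10],
})

# Number of engagement records per month (what count() measures).
records_per_month = toy.groupby("month")["lp_id"].count()

# Number of distinct products seen per month.
products_per_month = toy.groupby("month")["lp_id"].nunique()

print(records_per_month)
print(products_per_month)
```

Because every record corresponds to one product in one district on one day, the row count grows with both the number of districts using a product and the number of days it was used.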

In [None]:
colors = ['red', 'blue', 'green']

fig = plt.figure(figsize=(15,20))
fig.suptitle('Impact of COVID-19 on ("LC", "CM", "SDO") websites', fontsize=30, weight='bold', y=0.93)

for i, pef in enumerate(['LC', 'CM', 'SDO']):
    trunc_df = eng_product_merge[eng_product_merge['PEF']==pef].copy()
    
    most_used = trunc_df['Product Name'].value_counts().index[:20]
    
    plt.subplot(3, 1, i+1)
    for product in most_used:
        product_list = trunc_df[trunc_df['Product Name']== product].groupby('month')['lp_id'].count()
        plt.plot(product_list.index, product_list.values, color=colors[i]);
    
    plt.title(pef, fontsize=20, color='brown')
    plt.xticks(ticks=range(1,13), labels=range(1, 13));

Let's plot the average line for each category

In [None]:
colors = ['red', 'blue', 'green']

fig = plt.figure(figsize=(12,7))

for i, pef in enumerate(['LC', 'CM', 'SDO']):
    trunc_df = eng_product_merge[eng_product_merge['PEF']==pef].copy()

    most_used = trunc_df['Product Name'].value_counts().index[:20]
    
    product_month = {}
    for product in most_used:
        product_month[product] = trunc_df[trunc_df['Product Name']== product].groupby('month')['lp_id'].count()
        product_month[product] = check(product_month[product])
    
    product_month = pd.DataFrame(product_month)
        
    average_plot = product_month.mean(axis=1)
    
    plt.plot(average_plot.index, average_plot.values, color=colors[i], label=pef);
    
plt.title('Impact of COVID-19 on ("LC", "CM", "SDO") websites (Averages)', fontsize=20, color='brown')
plt.legend();

# for memory usage
del product_month

Setting aside the drop we talked about before, it seems that, approximately:
- The usage of `LC` websites **returned** to its original level from before the drop
- The usage of `CM` websites **increased**
- The usage of `SDO` websites **decreased**
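
The three labels above are read off the plot by eye. One way to make that reading reproducible is to compare the average monthly count before the drop with the average after it. The sketch below is only an illustration: the function name `classify_recovery`, the choice of months (January and February as "before", September to November as "after"), the 10% tolerance, and the toy numbers are all assumptions, not values taken from this notebook.

```python
import pandas as pd

def classify_recovery(monthly, pre_months=(1, 2), post_months=(9, 10, 11), tol=0.10):
    """Label a monthly usage series as 'increased', 'decreased' or 'returned'.

    pre_months, post_months and tol are illustrative assumptions.
    """
    pre = monthly.reindex(list(pre_months)).mean()    # average level before the drop
    post = monthly.reindex(list(post_months)).mean()  # average level after the drop
    change = (post - pre) / pre                       # relative change between the two
    if change > tol:
        return 'increased'
    if change < -tol:
        return 'decreased'
    return 'returned'

# Toy monthly record counts (made-up numbers, for illustration only)
toy = pd.DataFrame({
    'LC':  [100, 105, 90, 70, 65, 40, 30, 60, 100, 104, 102, 80],
    'CM':  [100, 100, 95, 90, 85, 50, 40, 90, 140, 150, 145, 110],
    'SDO': [100, 100, 80, 60, 55, 30, 25, 50, 70, 75, 72, 60],
}, index=range(1, 13))

labels = {pef: classify_recovery(toy[pef]) for pef in toy.columns}
print(labels)
```

With real data, `toy[pef]` would be replaced by the averaged monthly series computed for each category, and the month windows and tolerance would need to be chosen to match where the drop actually sits.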

### This analysis has been done on the whole dataset. Let's investigate each state separately and see if that leads us to something.

Combining districts for each state together

In [None]:
# fix the random seed so that the sampled states (and the results) are reproducible
np.random.seed(42)

states = {} 
states_to_take = np.random.choice(districts_info['state'].unique(), size=10, replace=False)

for state in states_to_take:
    districts = districts_info[districts_info['state']==state].district_id.values
    
    to_merge = []
    for district in districts:
        to_merge.append(pd.read_csv(f'../input/learnplatform-covid19-impact-on-digital-learning/engagement_data/{district}.csv'))
    
    states[state] = pd.concat(to_merge)

In [None]:
for state in states:
    states[state] = pd.merge(states[state], products_info, left_on='lp_id', right_on='LP ID', how='inner').drop('LP ID', axis=1)
    
    # doing the same processes
    states[state]['time'] = pd.to_datetime(states[state]['time'])
    states[state]['month'] = states[state]['time'].dt.month
    
    states[state]['PEF'] = states[state]['Primary Essential Function'].str.split('-')
    states[state]['PEF'] = states[state]['PEF'].apply(lambda x: x[0])
    states[state]['PEF'] = states[state]['PEF'].str.strip()

In [None]:
fig, axes = plt.subplots(nrows=10, ncols=3, figsize=(25,65))
fig.suptitle('Impact of COVID-19 on ("LC", "CM", "SDO") websites (from each state perspective)', fontsize=30, weight='bold', y=0.91)

colors = ['red', 'blue', 'green']
for i, state in enumerate(states):
    for j, pef in enumerate(['LC', 'CM', 'SDO']):
        data = states[state]
        trunc_df = data[data['PEF']==pef].copy()

        most_used = trunc_df['Product Name'].value_counts().index[:20]

        for product in most_used:
            product_list = trunc_df[trunc_df['Product Name']== product].groupby('month')['lp_id'].count()
            axes[i, j].plot(product_list.index, product_list.values, color=colors[j], alpha=0.5);

        axes[i, j].set_title(f'{state}\n({pef})', fontsize=20, color=colors[j])
        axes[i, j].set_xticks(range(1, 13));

Let's plot the average lines instead of all that mess.

In [None]:
fig, axes = plt.subplots(nrows=10, ncols=1, figsize=(25,65))
fig.suptitle('Impact of COVID-19 on ("LC", "CM", "SDO") websites (Averages-from each state perspective)', fontsize=30, weight='bold', y=0.91)

colors = ['red', 'blue', 'green']
for i, state in enumerate(states):
    for j, pef in enumerate(['LC', 'CM', 'SDO']):
        data = states[state]
        trunc_df = data[data['PEF']==pef].copy()

        most_used = trunc_df['Product Name'].value_counts().index[:20]
        
        product_month = {}
        for product in most_used:
            product_month[product] = trunc_df[trunc_df['Product Name']== product].groupby('month')['lp_id'].count()
            product_month[product] = check(product_month[product])

        product_month = pd.DataFrame(product_month)

        average_plot = product_month.mean(axis=1)
    
        axes[i].plot(average_plot.index, average_plot.values, color=colors[j], label=pef);

        axes[i].set_title(state, fontsize=20, color=colors[j])
        axes[i].legend()

From the plots above, we can observe that the usage of learning platforms shows **three** common behaviours _after the drop we talked about before_:
* <font color='#862d59'>The usage has decreased from its original value (before the drop happened):</font>
    - New York
    - Wisconsin
    - Minnesota
    - Washington
* <font color='#862d59'>The usage has returned to its original value (before the drop happened):</font>
    - Utah
    - New Jersey
    - California
* <font color='#862d59'>The usage has increased from its original value (before the drop happened):</font>
    - Illinois
    - Texas

> The last state, `Indiana`, was a little unusual: the CM websites curve went up after the "drop" period, but the other two curves went down.

In [None]:
# for memory usage
del states, product_month

Let's dig deeper and see if we can know the reason for the previous phenomena

In [None]:
to_investigate = [['New York', 'Wisconsin', 'Minnesota', 'Washington'],
                  ['Utah', 'New Jersey', 'California'],  
                  ['Illinois', 'Texas']]             

titles = ['Decreasing the usage from its original value (before the drop)',
          'Returning the usage to its original value (before the drop)',
          'Increasing the usage from its original value (before the drop)']

**From "Locale" Perspective**

In [None]:
order = ['City', 'Town', 'Suburb', 'Rural']

investigate('locale', (15,20), 'From "Locale" Perspective', order=order)

**We can see that:** <br>
* **<font color='#336699'>The states where learning-platform usage has decreased from its original value (before the drop) tend to have:</font>** <br>
     - Fewer towns (0% compared to 14% and 5%) <br>
     - More rural areas (28% compared to 7% and 15%) <br>
        <hr>
* **<font color='#336699'>The states where learning-platform usage has returned to its original value (before the drop) tend to have:</font>** <br>
     - More towns (14% compared to 0% and 5%) <br>
     - Fewer rural areas (7% compared to 28% and 15%) <br>
        <hr>
* **<font color='#336699'>The states where learning-platform usage has increased from its original value (before the drop) tend to have:</font>** <br>
     - More suburbs (75% compared to 53% and 50%) <br>
     - Fewer cities (5% compared to 26% and 22%) <br>
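
The percentages above are shares of districts per locale within each group of states. As a rough sketch of how such shares can be computed, a row-normalised cross-tabulation is one option. The toy rows and the `state_group` mapping below are made up for illustration, and this is not necessarily how the `investigate` helper computes its figures.

```python
import pandas as pd

# Toy stand-in for districts_info (made-up rows, for illustration only)
toy_districts = pd.DataFrame({
    'state':  ['New York', 'New York', 'Utah', 'Utah',
               'Texas', 'Texas', 'Texas', 'Illinois'],
    'locale': ['Rural', 'Suburb', 'Town', 'Suburb',
               'Suburb', 'Suburb', 'City', 'Suburb'],
})

# Hypothetical mapping from each state to its behaviour group
state_group = {
    'New York': 'decreased',
    'Utah': 'returned',
    'Texas': 'increased',
    'Illinois': 'increased',
}
toy_districts['group'] = toy_districts['state'].map(state_group)

# normalize='index' divides each row by its total, so every group sums to 100%
locale_share = pd.crosstab(
    toy_districts['group'], toy_districts['locale'], normalize='index'
) * 100

print(locale_share.round(1))
```

Because each row is normalised separately, groups with different numbers of districts can be compared directly.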

**From "pct_black/hispanic" Perspective**<a id='bh'></a>

In [None]:
order = ['[0, 0.2[', '[0.2, 0.4[', '[0.4, 0.6[', '[0.6, 0.8[', '[0.8, 1[']
    
investigate('pct_black/hispanic', (15,20), 'From "pct_black/hispanic" Perspective', order=order)

**We can see that:**
* **<font color='#336699'>The states where learning-platform usage has decreased from its original value (before the drop) tend to have:</font>**
    - A relatively low percentage of Black/Hispanic students (in about 5.56% of their districts, 60% or more of the students are Black or Hispanic).
<hr>
* **<font color='#336699'>The states where learning-platform usage has returned to its original value (before the drop) tend to have:</font>**
    - A slightly higher percentage of Black/Hispanic students (in about 9.31% of their districts, 60% or more of the students are Black or Hispanic).
<hr>
* **<font color='#336699'>The states where learning-platform usage has increased from its original value (before the drop) tend to have:</font>**
    - A relatively high percentage of Black/Hispanic students (in about 25% of their districts, 60% or more of the students are Black or Hispanic).
    
<font color='red'>_Note:_
> The states where learning-platform usage has decreased from its original value (before the drop) are the least likely to have a district in which 60% or more of the students are Black or Hispanic. This may indicate that the policies in those states (how these students are treated by the law) are not the best, or even that there is bad behaviour from some white people there, such as bullying and racism. All of that would deprive these students of the right environment for learning. To explain a bit more: if they or members of their families fall ill, such policies or behaviour would keep them from being treated equally with white patients.<br>
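
The "60% or higher" figures rely on the interval labels used in `order` above, such as `'[0.6, 0.8['`. A minimal sketch of how a share like this could be derived from such labels is shown below. The helper name `lower_bound`, the `group` column, and the toy rows are assumptions made for illustration only.

```python
import pandas as pd

def lower_bound(interval):
    """Return the lower bound of a label such as '[0.6, 0.8[' as a float."""
    return float(interval.strip('[]').split(',')[0])

# Toy districts with made-up interval labels, for illustration only
toy = pd.DataFrame({
    'group': ['decreased'] * 4 + ['increased'] * 4,
    'pct_black/hispanic': ['[0, 0.2[', '[0, 0.2[', '[0.2, 0.4[', '[0.6, 0.8[',
                           '[0, 0.2[', '[0.6, 0.8[', '[0.8, 1[', '[0.4, 0.6['],
})

# A district counts as "60% or higher" when its interval starts at 0.6 or above
toy['high_share'] = toy['pct_black/hispanic'].map(lower_bound) >= 0.6

# Percentage of such districts within each group
share = toy.groupby('group')['high_share'].mean() * 100
print(share)
```

The same pattern would apply to the `pct_free/reduced` intervals, with the threshold changed to 0.2.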

**From "pct_free/reduced" Perspective**<a id='fr'></a>

In [None]:
order = ['[0, 0.2[', '[0.2, 0.4[', '[0.4, 0.6[', '[0.6, 0.8[', '[0.8, 1[']

investigate('pct_free/reduced', (15,20), 'From "pct_free/reduced" Perspective', order=order)

**We can see that:**
* **<font color='#336699'>The states where learning-platform usage has decreased from its original value (before the drop) tend to have:</font>**
    - A relatively high percentage of students eligible for free or reduced-price lunch (in about 83.33% of their districts, 20% or more of the students are eligible).
<hr>
* **<font color='#336699'>The states where learning-platform usage has returned to its original value (before the drop) tend to have:</font>**
    - A moderate percentage of students eligible for free or reduced-price lunch (in about 67.44% of their districts, 20% or more of the students are eligible).
<hr>
* **<font color='#336699'>The states where learning-platform usage has increased from its original value (before the drop) tend to have:</font>**
    - A slightly lower percentage of students eligible for free or reduced-price lunch (in about 60% of their districts, 20% or more of the students are eligible).
    
<font color='red'>_Note:_
> Having a large percentage of students eligible for free or reduced-price lunch appears to go together with lower usage of learning platforms. The reason may be the cost of good internet access: during COVID-19 the load on internet providers increased, so the cheaper service plans may no longer have been good enough.

**From "pp_total_raw" Perspective**<a id='tr'></a>

In [None]:
order = ['[4000, 6000[', '[6000, 8000[', '[8000, 10000[', '[10000, 12000[', '[12000, 14000[', '[14000, 16000[', 
         '[16000, 18000[', '[18000, 20000[']

investigate('pp_total_raw', (15,20), 'From "pp_total_raw" Perspective', order=order)

**We can see that:**
* **<font color='#336699'>The states where learning-platform usage has decreased from its original value (before the drop) tend to have:</font>**
    - Per-pupil expenditure that mostly lies between 14000-16000\$
<hr>
* **<font color='#336699'>The states where learning-platform usage has returned to its original value (before the drop) tend to have:</font>**
    - Per-pupil expenditure that mostly lies between 6000-10000\$
<hr>
* **<font color='#336699'>The states where learning-platform usage has increased from its original value (before the drop) tend to have:</font>**
    - Per-pupil expenditure that mostly lies between 12000-14000\$
    
<font color='red'>_Note:_
> Raising a school's fees is not a prerequisite for having good students (students who keep up with their lessons despite the circumstances), and of course the fees shouldn't be lowered so much that the quality of education suffers.

### We've seen the impact of COVID-19 on different "PEF" categories and different states. What about educational sectors?

_Investigating the Impact of COVID-19 on each PEF for each sector separately_

In [None]:
sec_df = eng_product_merge.copy()
sec_df['Sec'] = sec_df['Sector(s)'].str.split(';')
sec_df = sec_df.explode('Sec')
sec_df['Sec'] = sec_df['Sec'].str.strip()

In [None]:
fig = plt.figure(figsize=(17, 40))
fig.suptitle('Impact of COVID-19 on ("LC", "CM", "SDO") websites (Averages-from each sector perspective)', fontsize=20, weight='bold', y=0.91)
for i, sector in enumerate(sec_df['Sec'].unique()):
    plt.subplot(6, 1, i+1)
    data = sec_df[sec_df['Sec']==sector]
    
    for pef in ['CM','SDO','LC']:
        result = data[data['PEF']==pef].groupby('month')['lp_id'].count()
        plt.plot(result.index, result.values, label=pef)
        plt.title(sector, fontsize=12, color='brown')
    plt.legend()

> As we can see, the effect of COVID-19 is almost the same for each educational sector.
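
"Almost the same" is a visual judgement. One hedged way to put a number on it is to correlate the monthly curves of the sectors with each other: values close to 1 mean the curves rise and fall together. The sector names and counts below are made-up toy data, not results from this notebook.

```python
import pandas as pd

# Toy monthly record counts per sector (made-up numbers, for illustration only)
toy = pd.DataFrame({
    'PreK-12':   [100, 110, 90, 70, 60, 30, 20, 50, 120, 130, 125, 90],
    'Higher Ed': [50, 56, 44, 36, 30, 16, 10, 26, 58, 66, 62, 46],
    'Corporate': [10, 11, 9, 8, 6, 3, 2, 5, 11, 13, 12, 9],
}, index=range(1, 13))

# Pearson correlation ignores differences in scale, so sectors with very
# different volumes can still be compared on the shape of their curves
similarity = toy.corr()
print(similarity.round(3))
```

With the real data, one such table could be built per PEF category from the monthly counts plotted above.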

In [None]:
# for memory usage
del sec_df

Let's investigate the usage of 6 randomly chosen subcategories from each of the three main categories. <a id=sub></a>

In [None]:
for pef in ['LC', 'CM', 'SDO']:
    temp = eng_product_merge[eng_product_merge['PEF']==pef].copy()

    np.random.seed(42)
    random_chosen = np.random.choice(temp['sub_PEF'].unique(), size=6, replace=False)

    fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(20, 8))

    fig.suptitle(pef, fontsize=15, y=0.96)

    axes = axes.ravel()
    for i, spef in enumerate(random_chosen):
        fig.sca(axes[i])

        sub_pef = temp[temp['sub_PEF']== spef].groupby('month')['lp_id'].count()
        sub_pef = check(sub_pef)

        plt.plot(sub_pef.index, sub_pef.values)
        plt.title(spef, color='brown')

    del temp
    del sub_pef


## <font color='#660066'> LC </font>
>**Most of the subcategories increased from their original values after the drop, or at least returned to their original state, such as:** <br>
    - Digital Learning Platform.    
    - Sites, Resources & Reference Encyclopedia.     
    - Study Tools.

<hr>

>**So from the above plot we can conclude that:** <br>
    <li> Most of the subcategories, if not all, returned to their original values or even surpassed them (the peak is most likely in October).    
    <li> Before the drop, the `career planning and job search` subcategory was decreasing (maybe because of the spread of COVID-19 and the Emergency Declaration at that time), but after the drop it went up rapidly and surpassed its previous state (presumably because people became open to the idea of working from home and looked for ways to keep working despite the circumstances).


## <font color='#660066'> CM </font>
>**As expected, most of the subcategories increased from their original values after the drop, such as:** <br>
    - Classroom Engagement & Instruction Assessment & Classroom Response.    
    - Teacher Resources Professional Learning.     
    - Virtual Classroom Video Conferencing & Screen Sharing.

<hr>

>**So from the above plot we can conclude that:** <br>
    <li> Here too, most of the subcategories returned to their original values or even surpassed them (the peak is most likely in October), <font color='red'> except </font> `Classroom Engagement & Instruction Communication & Messaging`. This looks odd at first, because `Classroom Engagement & Instruction Assessment & Classroom Response` and `Classroom engagement & Instruction Classroom Management` both surpassed their original state from before the drop. On closer inspection it is reasonable: beyond classroom engagement, `Classroom Engagement & Instruction Communication & Messaging` only adds communication and messaging, which is no longer a distinguishing feature because it can be done through Facebook or WhatsApp. The other two subcategories, on the other hand, add new features: instruction assessment (in the first one) and classroom management (in the second one). <br><br>
    <li> The use of virtual classroom video conferencing & screen sharing started increasing in March and is still increasing because of online learning from home: it is how lecturers and their students communicate.

## <font color='#660066'> SDO </font>
>**Here too, most of the subcategories increased from their original values after the drop, such as:** <br>
    - Data Analytics & Reporting Student Information Systems (SIS).    
    - Environmental, Health & Safety (EHS) Compliance.     
    - School Management Software SSO.

<hr>

>**So from the above plot we can conclude that:** <br>
    <li> Here too, most of the subcategories returned to their original values or even surpassed them (the peak is most likely in October), <font color='red'> except </font> `Learning Management Systems (LMS)`. Digging deeper into that subcategory, we found that its products offer features that are also available on many other websites, and that those other products offer more besides. <br>
    <li> Products like "SafeSchool", "Clever", and "Infinite Campus" have received more attention because of COVID-19. <br>

<h2 align='center'><font color='#290066'>5. Conclusion</font></h2><a id=10></a>

○ The most used products in 2020 are for <code>the PreK-12 sector</code>, which is good: if students in this sector adapt to online tools and new technologies, it will help them in the future. >> [Reference](#sec)<br>

○ Most of the products that are used fall under the service of LC (Learning & Curriculum). >> [Reference](#pef)<br>

○ For LC category: >> [Reference](#sub)
> <li> Most of the subcategories, if not all, returned to their original values or even surpassed them (the peak is most likely in October).    
 <li> Before the drop, the <code>career planning and job search</code> subcategory was decreasing (maybe because of <code>the spread of COVID-19 and the Emergency Declaration</code> at that time), but after the drop it went up rapidly and surpassed its previous state (presumably because people became open to the idea of working from home and looked for ways to keep working despite the circumstances).

○ For CM category: >> [Reference](#sub)
> <li> Here too, most of the subcategories returned to their original values or even surpassed them (the peak is most likely in October), <code> except </code> <code>Classroom Engagement & Instruction Communication & Messaging</code>. This looks odd at first, because <code>Classroom Engagement & Instruction Assessment & Classroom Response</code> and <code>Classroom engagement & Instruction Classroom Management</code> both surpassed their original state from before the drop. On closer inspection it is reasonable: beyond classroom engagement, <code>Classroom Engagement & Instruction Communication & Messaging</code> only adds communication and messaging, which is no longer a distinguishing feature because it can be done through Facebook or WhatsApp. The other two subcategories, on the other hand, add new features: instruction assessment (in the first one) and classroom management (in the second one). <br><br>
    <li> The use of virtual classroom video conferencing & screen sharing started increasing in March and is still increasing because of online learning from home: it is how lecturers and their students communicate.
        
○ For SDO category: >> [Reference](#sub)
> <li> Here too, most of the subcategories returned to their original values or even surpassed them (the peak is most likely in October), <font color='red'> except </font> <code>Learning Management Systems (LMS)</code>. Digging deeper into that subcategory, we found that its products offer features that are also available on many other websites, and that those other products offer more besides. <br>
   <li> Products like "SafeSchool", "Clever", and "Infinite Campus" have received more attention because of COVID-19. <br>
       
      
○ **<font color='#336699'>The states where learning-platform usage has decreased from its original value (before the drop) tend to have the following features:</font>** <br>
&nbsp;&nbsp;&nbsp;&nbsp;• A relatively low percentage of Black/Hispanic students (in about <font color='red'>5.56%</font> of their districts, <font color='red'>60%</font> or more of the students are Black or Hispanic). >> [Reference](#bh)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• A relatively high percentage of students eligible for free or reduced-price lunch (in about <font color='red'>83.33%</font> of their districts, <font color='red'>20%</font> or more of the students are eligible). >> [Reference](#fr)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• Per-pupil expenditure that mostly lies between <font color='red'>14000-16000\$</font>. >> [Reference](#tr)<br>

○ **<font color='#336699'>The states where learning-platform usage has returned to its original value (before the drop) tend to have the following features:</font>** <br>
&nbsp;&nbsp;&nbsp;&nbsp;• A slightly higher percentage of Black/Hispanic students (in about <font color='red'>9.31%</font> of their districts, <font color='red'>60%</font> or more of the students are Black or Hispanic). >> [Reference](#bh)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• A moderate percentage of students eligible for free or reduced-price lunch (in about <font color='red'>67.44%</font> of their districts, <font color='red'>20%</font> or more of the students are eligible). >> [Reference](#fr)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• Per-pupil expenditure that mostly lies between <font color='red'>6000-10000\$</font>. >> [Reference](#tr)<br>

○ **<font color='#336699'>The states where learning-platform usage has increased from its original value (before the drop) tend to have the following features:</font>**<br>
&nbsp;&nbsp;&nbsp;&nbsp;• A relatively high percentage of Black/Hispanic students (in about <font color='red'>25%</font> of their districts, <font color='red'>60%</font> or more of the students are Black or Hispanic). >> [Reference](#bh)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• A slightly lower percentage of students eligible for free or reduced-price lunch (in about <font color='red'>60%</font> of their districts, <font color='red'>20%</font> or more of the students are eligible). >> [Reference](#fr)<br>
&nbsp;&nbsp;&nbsp;&nbsp;• Per-pupil expenditure that mostly lies between <font color='red'>12000-14000\$</font>. >> [Reference](#tr)<br>

**From that we can conclude the following:**

> The states where learning-platform usage has decreased from its original value (before the drop) are the least likely to have a district in which 60% or more of the students are Black or Hispanic >>[Reference](#bh)<<. This may indicate that the policies in those states (how these students are treated by the law) are not the best, or even that there is bad behaviour from some white people there, such as <font color='red'>bullying and racism</font>. All of that would deprive these students of the right environment for learning. To explain a bit more: if they or members of their families fall ill, such policies or behaviour would keep them from being treated equally with white patients.<br>

> Having a large percentage of students eligible for free or reduced-price lunch appears to go together with lower usage of learning platforms. The reason may be the cost of good internet access: during COVID-19 the load on internet providers increased, so the cheaper service plans may no longer have been good enough. >>[Reference](#fr)<br>


> <font color='red'>Raising a school's fees is not a prerequisite</font> for having good students (students who keep up with their lessons despite the circumstances), and of course the fees shouldn't be lowered so much that the quality of education suffers. >>[Reference](#tr)<br>

**<font color='#4d2600'>Suggested Solutions:</font>**

<q> Human rights should be granted to every human, not just used as a slogan by people seeking a position. We don't know exactly how, as we have no experience in these matters, but more effort should be made to guarantee that every human has their rights.</q><br>

> Poverty should not be an obstacle on the road to learning. This issue may be solved by:<br>
&nbsp;&nbsp;&nbsp;&nbsp;• Providing each week's material (lectures and assignments) on DVDs at the beginning of the week, so that a student who can't afford a good internet connection can buy the DVD instead.<br>
&nbsp;&nbsp;&nbsp;&nbsp;• If these students have something to identify them (like a card), providing them with a place that has a good internet connection (in the school or even in every district).<br>

> A limit should be put on school expenditure so that fees don't go up. Maybe high fees make it hard for a family to afford a good internet connection. Or maybe high fees make the school accessible to rich people only, which upsets poorer people and, in turn, makes them want to work to earn more money instead of learning.<br>