![CAT_logo2.png](attachment:CAT_logo2.png)

# <span style= 'color: #7A3803'> Curriculum Engagement and Usage Analysis
- <span style= 'color: #EC9706'> By CAT Team: Miatta Sinayoko, Martin Reye and Annie Carter
- <span style= 'color: #EC9706'> Sourced by CodeUp, LLC

# <span style ='color:#7A3803'><center> <b>EXECUTIVE SUMMARY</b></center>

  

## <span style ='color:#D16002'> Project Background and Description
In preparation for the upcoming Thursday afternoon board meeting, we are currently analyzing the usage patterns of our educational curriculum using anomaly detection techniques. Our aim is to identify any inconsistencies or issues and provide insights into cohort engagement, detect irregular activities, and uncover trends in post-graduation knowledge retention. This initiative is designed to provide you with comprehensive data to facilitate informed discussions and decision-making during the board meeting.

## <span style ='color:#D16002'>Project Goal

The primary objectives are:
* Analyze lesson popularity across programs, cohort engagement differences, and low engagement student profiles to enhance curriculum effectiveness.
* Investigate anomalies, security breaches, and cross-curriculum access, while assessing post-graduation knowledge impact.
* Uncover insights from underutilized lessons and unexpected trends, providing comprehensive data for informed decision-making.

## <span style ='color:#D16002'>Initial Questions
1. Which lesson consistently receives the most traffic across cohorts within each program?
2. Is there a particular cohort that has demonstrated significantly higher engagement with a specific lesson than other cohorts, warranting further investigation?
3. Are there active students who exhibit minimal interaction with the curriculum? If so, what do their behavior and engagement patterns look like?
4. Is there any potentially unauthorized access or suspicious activity, such as unusual access patterns or signs of web scraping? Are there any suspicious IP addresses?
5. Was cross-curriculum access for students and alumni (web development to data science, data science to web development) disabled as intended at some point in 2019? If so, when did the change take effect?
6. Which topics do graduates continue to reference after graduation and into their professional roles, for each program?
7. Which lessons have recorded the lowest levels of access?
8. Is there any additional information or insight that the client should be aware of?

## <span style ='color:#D16002'>Data Dictionary
The data was acquired from CodeUp, LLC's 'curriculum_logs' dataset, initially containing 847,330 rows and 7 columns. The team divided the work among its members, each manipulating the original dataframe as needed to address the specified questions.

|    Original Column Name     |   Target    |       Data Type          |       Definition              |
|-----------------------------|-------------|--------------------------|------------------------------ |
|        Various              |  Various    |      Various             | Target dependent on Question  |
                                               


| Original Column Name | Feature    | Non-Null Count and Data Type | Definition                          |
|----------------------|------------|------------------------------|-------------------------------------|
| date                 | date       | 847329 non-null Datetime     | Date of access                      |
| time                 | time       | 847330 non-null object       | Time of access                      |
| l.path               | lesson     | 847330 non-null object       | Lesson path                         |
| user_ID              | user_ID    | 847330 non-null int64        | User identification                 |
| c.name               | cohort     | 847330 non-null object       | Cohort name (e.g., Darden) or Staff |
| program_ID           | program    | 847330 non-null object       | Program name (e.g., Data Science)   |
| ip                   | ip         | 847330 non-null object       | Used for feature engineering        |
| start_date           | start_date | 847330 non-null object       | Program start date                  |
| end_date             | end_date   | 847330 non-null object       | Graduation date                     |
    

# <b>_____________________________________________________________________________________________________________________ </b>


In [9]:
# Transform Libraries
import pandas as pd
import numpy as np
import os

# CAT Team Libraries
import wrangle as w
from env import user, hostname, password

# Visualization Libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Misc. Libraries
import urllib.parse
import gzip
from sqlalchemy import create_engine
from io import BytesIO, StringIO

# <span style ='color:#7A3803'> ACQUIRE & PREPARE

In [10]:
# user, hostname and password are imported by name from env, so reference them directly.
w.get_connection('curriculum_logs', user=user, hostname=hostname, password=password)

In [2]:
# Load the curriculum logs; the frame is named logs_df to match the cells below.
logs_df = w.get_log_data()

AttributeError: module 'wrangle' has no attribute 'get_log_data'

# <b>_____________________________________________________________________________________________________________________ </b>


# <span style ='color:#7A3803'> CAT TEAM : EXPLORATION & ANALYSIS

### <span style ='color:#D16002'>1. Which lesson consistently receives the most traffic across cohorts within each program?
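
One way to approach this question is sketched below. It is not the team's `wrangle` implementation: the rows and lesson paths are made up for illustration, the column names follow the data dictionary, and excluding the bare homepage path `/` is an assumption.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the data dictionary.
logs = pd.DataFrame({
    "program": ["Web Development"] * 4 + ["Data Science"] * 3,
    "cohort": ["Apollo", "Apollo", "Denali", "Denali", "Darden", "Darden", "Darden"],
    "lesson": [
        "javascript-i", "javascript-i", "javascript-i", "html-css",
        "classification/overview", "sql/mysql-overview", "classification/overview",
    ],
})

def top_lesson_per_program(df):
    """Return the most-accessed lesson in each program, with hit and cohort counts."""
    # Assumption: the bare homepage path "/" is navigation, not a lesson.
    lessons = df[df["lesson"] != "/"]
    counts = (
        lessons.groupby(["program", "lesson"])
        .agg(hits=("cohort", "size"), cohorts=("cohort", "nunique"))
        .reset_index()
    )
    # Put the highest-traffic lesson first within each program, then keep only that row.
    counts = counts.sort_values(["program", "hits"], ascending=[True, False])
    return counts.drop_duplicates("program").reset_index(drop=True)

top = top_lesson_per_program(logs)
print(top)
```

The `cohorts` column shows how many distinct cohorts contributed to the count, which helps separate a lesson that is popular across cohorts from one driven by a single cohort.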

###  <span style ='color:#D16002'> QUESTION 1 Findings
#### <span style= 'color: #EC9706'> ..

###  <span style ='color:#D16002'>2. Is there a particular cohort that has demonstrated significantly higher engagement with a specific lesson than other cohorts, warranting further investigation?

In [None]:
w.question2_1(logs_df)  

In [None]:
w.question2_2(logs_df)

In [None]:
w.question2_graph(logs_df)

###  <span style ='color:#D16002'> QUESTION 2 Findings
#### <span style= 'color: #EC9706'> **Methodology and Insights:** We identified anomalies by comparing each cohort's access count for the '/' path, the curriculum homepage, which received the highest number of accesses. Aside from "Staff," the "Darden" cohort stood out, accessing it 2,980 times, compared with a single access each for the Denali, Apollo, and Everglades cohorts.
- The difference in engagement between the Darden cohort and the other cohorts for '/' indicates a potential anomaly that requires further investigation.

#### <span style= 'color: #EC9706'>**Recommendation:** We recommend additional qualitative research into factors such as curriculum content, teaching methods, and cohort dynamics to understand the anomaly's underlying causes.
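
The cohort comparison described above can be sketched roughly as follows. This is not the logic inside `w.question2_1`; it rebuilds a small log from the counts reported in the findings and applies a simple median-ratio rule, where the function name and the threshold of 10 are hypothetical choices.

```python
import pandas as pd

# Rebuild a small log from the counts reported above for the "/" path.
# Staff is left out because its count is not stated in the findings.
reported = {"Darden": 2980, "Denali": 1, "Apollo": 1, "Everglades": 1}
logs = pd.DataFrame(
    [{"cohort": cohort, "lesson": "/"} for cohort, n in reported.items() for _ in range(n)]
)

def flag_cohort_outliers(df, lesson, ratio=10):
    """Flag cohorts whose hits on `lesson` exceed `ratio` times the median cohort count."""
    counts = df.loc[df["lesson"] == lesson, "cohort"].value_counts()
    median = counts.median()
    return counts[counts > ratio * median]

flagged = flag_cohort_outliers(logs, "/")
print(flagged)
```

Comparing against the median rather than the mean keeps a single very large cohort from inflating the baseline it is measured against.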

### <span style ='color:#D16002'>3. Are there active students who exhibit minimal interaction with the curriculum? If so, what do their behavior and engagement patterns look like?

###  <span style ='color:#D16002'> QUESTION 3 Findings
#### <span style= 'color: #EC9706'> **Methodology and Insights:** We identified active students with minimal curriculum interaction, defined as students who engaged with fewer than three unique lessons. This low engagement points to potential challenges or missed opportunities in their learning journey.

- Limited Lesson Diversity: These students often interacted with a small range of lessons, suggesting a lack of exploration.
- Addressing Challenges: Challenges such as time constraints or difficulties with material may contribute to low engagement.
- Early Intervention: Detecting low engagement early can enable timely support and tailored interventions.

#### <span style= 'color: #EC9706'>**Recommendation:** To enhance engagement and support learning, we suggest implementing proactive measures, such as personalized assistance and additional resources, for students displaying minimal curriculum interaction.
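
The "fewer than three unique lessons" rule can be illustrated with the sketch below. The rows are made up, and treating a student as active when the access date falls between `start_date` and `end_date` is an assumption, since the report does not state how "active" was defined.

```python
import pandas as pd

# Made-up rows for illustration; dates are parsed up front because the
# data dictionary lists start_date and end_date as object columns.
logs = pd.DataFrame({
    "user_ID": [1, 1, 1, 1, 2, 2, 3],
    "lesson": ["a", "b", "c", "a", "a", "a", "b"],
    "date": pd.to_datetime(["2020-02-03"] * 6 + ["2020-09-01"]),
    "start_date": pd.to_datetime(["2020-01-06"] * 7),
    "end_date": pd.to_datetime(["2020-06-26"] * 7),
})

def low_engagement_students(df, min_lessons=3):
    """Return unique-lesson counts for active students below `min_lessons`."""
    # Assumption: "active" means the access happened during the student's program dates.
    active = df[(df["date"] >= df["start_date"]) & (df["date"] <= df["end_date"])]
    unique_lessons = active.groupby("user_ID")["lesson"].nunique()
    return unique_lessons[unique_lessons < min_lessons]

low = low_engagement_students(logs)
print(low)
```

In this toy data, user 1 reaches three unique lessons, user 2 stays on a single lesson, and user 3 only appears after graduation and is therefore not counted as active.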

### <span style ='color:#D16002'>4. Is there any potentially unauthorized access or suspicious activity, such as unusual access patterns or signs of web scraping? Are there any suspicious IP addresses?

###  <span style ='color:#D16002'> QUESTION 4 Findings
#### <span style= 'color: #EC9706'> ..

### <span style ='color:#D16002'>5. Was cross-curriculum access for students and alumni (web development to data science, data science to web development) disabled as intended at some point in 2019? If so, when did the change take effect?

In [None]:
#Plots the average monthly logs for Data Science students.
w.plot_monthly_avg_ds_logs()

In [None]:
# Plots the average monthly logs for Web Development students.
w.plot_monthly_avg_wd_logs()

###  <span style ='color:#D16002'> QUESTION 5 Findings
#### <span style= 'color: #EC9706'> ..

### <span style ='color:#D16002'>6. Which topics do graduates continue to reference after graduation and into their professional roles, for each program?

In [None]:
#Plots the average monthly logs for Web Development students.
w.plot_monthly_avg_wd_logs()

In [None]:
w.plot_top_wb_alumni_lessons()

###  <span style ='color:#D16002'> QUESTION 6 Findings
#### <span style= 'color: #EC9706'> ..

### <span style= 'color: #EC9706'>7. Which lessons have recorded the lowest levels of access?
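
A minimal sketch of one way to rank the least-accessed lessons is shown below. The rows are made up for illustration, and the function name and the `n` parameter are hypothetical rather than part of `wrangle`.

```python
import pandas as pd

# Made-up rows for illustration only.
logs = pd.DataFrame({"lesson": ["a", "a", "a", "b", "c", "c"]})

def least_accessed_lessons(df, n=2):
    """Return the `n` lessons with the fewest log entries, lowest first."""
    return df["lesson"].value_counts().nsmallest(n)

least = least_accessed_lessons(logs)
print(least)
```

On real logs, it may be worth filtering out paths that are not lessons before ranking, since mistyped or one-off URLs will otherwise dominate the bottom of the list.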

###  <span style ='color:#D16002'> QUESTION 7 Findings
#### <span style= 'color: #EC9706'> ..

### <span style ='color:#D16002'>8. Is there any additional information or insight that the client should be aware of?


###  <span style ='color:#D16002'> QUESTION 8 Findings
#### <span style= 'color: #EC9706'> ..