# Analysis for Continuous Improvement

Author Name: Marlette Jessa C. Dakiwas

9-digit PID: 730474304

Continuous Improvement embraces a belief there is _always room to make things better_. It is a mindset and process we value and practice in this course. In this assignment, you are able to practice continuous improvement and contribute to the design ideas of the course.

## Brainstorming Ideas

Reflect on your personal experiences and observations in COMP110 and **brainstorm modifications to the course that _create value_ beyond its current design**. When brainstorming, try not to be critical of the ideas you come up with regarding scale, stakeholders impacted, or for any other reasons. In the markdown cell below, brainstorm 3 to 5 ideas you think would create value for you.

Each brainstormed idea should state a. the suggested change or addition, b. the expected value created, and c. which specific stakeholders would benefit. If helpful, expand on the following template: "The course should (state idea here) because it will (state value created here) for (insert stakeholders here)."

Example A: "The course should use only examples from psychology experiments because it will be more relevant for students who are psychology majors."

Example B: "The course should not have post-lesson questions because they are not useful for most students in the class."

### Part 1. Creative Ideation

1. The course instructor should use a microphone system that transcribes what is said and translates it into other languages, displayed as captions on a screen, so that students who are not native English speakers or who are hard of hearing can follow along.
2. The course should have at least one or two program-building group projects to give students experience in collaborative coding.
3. The course should provide smaller-scale, optional coding assignments that aren't graded to give students more opportunities to implement concepts and experiment.
4. The course should assign a final project that lets students express creativity by building a program of their choice that implements concepts learned throughout the semester.
5. The course should use small group discussions in class so students can give each other feedback on how to problem-solve an example program.

## Connecting with Available Data

The data you have available for this analysis is limited to the anonymized course survey you and your peers filled out a few weeks ago. The data is found in the `survey.csv` file in this exercise directory. Each row represents an individual survey response. Each column has a description which can be found on the project write-up here: <https://22s.comp110.com/exercises/ex08.html>

Review the list of available data and identify which one of your ideas _does not_, or is _least likely to_, have relevant data to support the analysis of your idea to create value. In the box below, identify which of your ideas lacks data and suggest how we might be able to collect this data in the future. One aspect of _continuous improvement_ is trying to avoid "tunnel vision" where possible improvements are not considered because there is no data available to analyze it. Identifying new data sources can unlock improvements!

### Part 2. Identifying Missing Data

1. Idea without sufficient data to analyze: 
- The course instructor should use a microphone system that transcribes what is said and translates it into other languages, displayed as captions on a screen, so that students who are not native English speakers or who are hard of hearing can follow along.

2. Suggestion for how to collect data to support this idea in the future: 
- Add a survey question asking whether students need any accommodations for in-class learning.

## Choosing an Idea to Analyze

Consider those of your ideas which _do_ seem likely to have relevant data to analyze. If none of your ideas do, spend a few minutes and brainstorm another idea or two with the added connection of data available on hand and add those ideas to your brainstormed ideas list.

Select the one idea which you believe is _most valuable_ to analyze relative to the others and has data to support the analysis of. In the markdown cell for Part 3 below, identify the idea you are exploring and articulate why you believe it is most valuable (e.g. widest impact, biggest opportunity for improvement, simplest change for significant improvement, and so on).

### Part 3. Choosing Your Analysis

1. Idea to analyze with available data:
- The course should provide smaller-scale, optional coding assignments that aren't graded to give students more opportunities to implement concepts and experiment.
2. This idea is more valuable than the others brainstormed because: 
- We expect many students to enter the course with no experience, so the course should provide as many opportunities to practice problem-solving through code as it can, without adding the stress of grades and overlapping deadlines.


## Your Analysis

Before you begin analysis, a reminder that we do not expect the data to support everyone's ideas and you can complete this exercise for full credit even if the data does not clearly support your suggestion or even completely refutes it. What we are looking for is a logical attempt to explore the data using the techniques you have learned up until now in a way that _either_ supports, refutes, or does not have a clear result and then to reflect on your findings after the analysis.

Using the utility functions you created for the previous exercise, you will continue with your analysis in the following part. Before you begin, refer to the rubric on the technical expectations of this section in the exercise write-up.

In this section, you are expected to interleave code and markdown cells such that for each step of your analysis you are starting with an English description of what you are planning to do next in a markdown cell, followed by a Python cell that performs that step of the analysis.

### Part 4. Analysis

We begin by changing some settings in the notebook to automatically reload changes to imported files.

In [1]:
%reload_ext autoreload
%autoreload 2

We continue by importing the helper functions from `data_utils`.

In [2]:
from data_utils import read_csv_rows, columnar, head, select, count
from tabulate import tabulate

Next, we store the path to the provided data file so it can be referenced when reading and wrangling the data.

In [3]:
SURVEY_DATA_CSV_FILE_PATH: str = "../../data/survey.csv"

First, I will read the data provided to me and identify which specific columns I can work with to help support my idea.

In [4]:
data_rows: list[dict[str, str]] = read_csv_rows(SURVEY_DATA_CSV_FILE_PATH)

print(f"Data File Read: {SURVEY_DATA_CSV_FILE_PATH}")
print(f"{len(data_rows)} rows")
print(f"{len(data_rows[0].keys())} columns")
print(f"Columns names: {data_rows[0].keys()}")

Data File Read: ../../data/survey.csv
620 rows
35 columns
Columns names: dict_keys(['row', 'year', 'unc_status', 'comp_major', 'primary_major', 'data_science', 'prereqs', 'prior_exp', 'ap_principles', 'ap_a', 'other_comp', 'prior_time', 'languages', 'hours_online_social', 'hours_online_work', 'lesson_time', 'sync_perf', 'all_sync', 'flipped_class', 'no_hybrid', 'own_notes', 'own_examples', 'oh_visits', 'ls_effective', 'lsqs_effective', 'programming_effective', 'qz_effective', 'oh_effective', 'tutoring_effective', 'pace', 'difficulty', 'understanding', 'interesting', 'valuable', 'would_recommend'])


Seeing that there is data on students' prior experience, I will analyze the prior experience students bring to the course to test my hypothesis that students come in lacking experience.

I will have to transform the data to make it easier to analyze.

In [5]:
data_cols: dict[str, list[str]] = columnar(data_rows)

print(f"{len(data_cols.keys())} columns")
print(f"{len(data_cols['prior_exp'])} rows")
print(f"Columns names: {data_cols.keys()}")

35 columns
620 rows
Columns names: dict_keys(['row', 'year', 'unc_status', 'comp_major', 'primary_major', 'data_science', 'prereqs', 'prior_exp', 'ap_principles', 'ap_a', 'other_comp', 'prior_time', 'languages', 'hours_online_social', 'hours_online_work', 'lesson_time', 'sync_perf', 'all_sync', 'flipped_class', 'no_hybrid', 'own_notes', 'own_examples', 'oh_visits', 'ls_effective', 'lsqs_effective', 'programming_effective', 'qz_effective', 'oh_effective', 'tutoring_effective', 'pace', 'difficulty', 'understanding', 'interesting', 'valuable', 'would_recommend'])
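
For readers without the `data_utils` module at hand, the row-to-column transformation can be sketched roughly as follows. This is only an illustration of the idea, not the actual `columnar` implementation used here; the name `columnar_sketch` and the two sample rows are hypothetical.

```python
def columnar_sketch(rows: list[dict[str, str]]) -> dict[str, list[str]]:
    """Turn a list of row dicts into a dict mapping column name to its values."""
    if len(rows) == 0:
        return {}
    result: dict[str, list[str]] = {}
    # Assumes every row has the same keys as the first row.
    for name in rows[0]:
        result[name] = [row[name] for row in rows]
    return result


# Hypothetical sample rows, shaped like the survey data.
sample_rows = [
    {"row": "0", "prior_exp": "7-12 months"},
    {"row": "1", "prior_exp": "None to less than one month!"},
]
sample_cols = columnar_sketch(sample_rows)
print(sample_cols["prior_exp"])
```

With the data stored by column, a single survey question such as `prior_exp` can be pulled out as one list, which is what the later counting step relies on.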


After transforming the data table, I will look at the first few rows to understand what information the table holds.

In [6]:
data_cols_head: dict[str, list[str]] = head(data_cols, 5)

if len(data_cols_head.keys()) != len(data_cols.keys()) or len(data_cols_head["prior_exp"]) != 5:
    print("Oops, something is not right!")

tabulate(data_cols_head, data_cols_head.keys(), "html")

row,year,unc_status,comp_major,primary_major,data_science,prereqs,prior_exp,ap_principles,ap_a,other_comp,prior_time,languages,hours_online_social,hours_online_work,lesson_time,sync_perf,all_sync,flipped_class,no_hybrid,own_notes,own_examples,oh_visits,ls_effective,lsqs_effective,programming_effective,qz_effective,oh_effective,tutoring_effective,pace,difficulty,understanding,interesting,valuable,would_recommend
0,22,Returning UNC Student,No,Mathematics,No,"MATH 233, MATH 347, MATH 381",7-12 months,No,No,UNC,1 month or so,"Python, R / Matlab / SAS",3 to 5 hours,0 to 2 hours,6,2,2,1,2,4,4,0,7,3,7,5,,,1,1,7,5,6,5
1,25,Returning UNC Student,No,Mathematics,Yes,"MATH 130, MATH 231, STOR 155",None to less than one month!,,,,,,0 to 2 hours,5 to 10 hours,4,3,3,1,2,6,4,5,5,5,5,5,7.0,6.0,6,6,3,4,6,4
2,25,Incoming First-year Student,Yes - BA,Computer Science,No,"MATH 130, MATH 152, MATH 210",None to less than one month!,,,,,,3 to 5 hours,5 to 10 hours,3,3,4,2,1,7,7,2,5,6,7,7,4.0,,6,4,6,7,7,7
3,24,Returning UNC Student,Yes - BS,Computer Science,Maybe,"MATH 231, MATH 232, STOR 155",2-6 months,No,No,High school course (IB or other),None to less than one month!,Python,3 to 5 hours,3 to 5 hours,5,5,4,3,3,6,5,1,6,3,5,5,5.0,4.0,4,4,5,6,6,6
4,25,Incoming First-year Student,Yes - BA,Computer Science,No,MATH 130,None to less than one month!,,,,,,0 to 2 hours,3 to 5 hours,7,3,3,3,2,6,3,5,6,6,6,6,7.0,3.0,6,5,5,6,6,7


Now I will select the specific columns that would help me see whether the data backs up my hypothesis that students come into the course with little to no experience.

In [7]:
selected_data: dict[str, list[str]] = select(data_cols, ["prior_exp", "ap_principles", "ap_a", "other_comp", "prior_time", "languages"])

tabulate(head(selected_data, 10), selected_data.keys(), "html")

prior_exp,ap_principles,ap_a,other_comp,prior_time,languages
7-12 months,No,No,UNC,1 month or so,"Python, R / Matlab / SAS"
None to less than one month!,,,,,
None to less than one month!,,,,,
2-6 months,No,No,High school course (IB or other),None to less than one month!,Python
None to less than one month!,,,,,
2-6 months,No,No,High school course (IB or other),1 month or so,"Python, Java / C#, JavaScript / TypeScript, HTML / CSS"


I will write a new function that skips table rows lacking the data I need, so that every remaining row has at least one value in the selected columns.

In [22]:
from data_utils import no_data
skip_no_data: dict[str, list[str]] = no_data(data_cols, ["prior_exp", "ap_principles", "ap_a", "other_comp", "prior_time", "languages"])

tabulate(head(skip_no_data, 10), skip_no_data.keys(), "html")

prior_exp,ap_principles,ap_a,other_comp,prior_time,languages
7-12 months,No,No,UNC,1 month or so,"Python, R / Matlab / SAS"
None to less than one month!,,,,,
None to less than one month!,,,,,
2-6 months,No,No,High school course (IB or other),None to less than one month!,Python
None to less than one month!,,,,,
2-6 months,No,No,High school course (IB or other),1 month or so,"Python, Java / C#, JavaScript / TypeScript, HTML / CSS"
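
The body of `no_data` is not shown in this notebook, so the following is only a guess at what such a filter might look like. The function name `keep_rows_with_data`, its `require_all` flag, and the sample table are all hypothetical.

```python
def keep_rows_with_data(
    table: dict[str, list[str]], names: list[str], require_all: bool = False
) -> dict[str, list[str]]:
    """Select `names` from a column table, skipping rows with missing values.

    With require_all=False a row is kept if any selected column is non-empty;
    with require_all=True every selected column must be non-empty.
    """
    result: dict[str, list[str]] = {name: [] for name in names}
    if len(names) == 0:
        return result
    n_rows = len(table[names[0]])
    for i in range(n_rows):
        values = [table[name][i] for name in names]
        present = [value != "" for value in values]
        keep = all(present) if require_all else any(present)
        if keep:
            for name, value in zip(names, values):
                result[name].append(value)
    return result


# Hypothetical sample, shaped like the selected survey columns.
sample = {
    "prior_exp": ["7-12 months", "None to less than one month!", ""],
    "languages": ["Python", "", ""],
}
loose = keep_rows_with_data(sample, ["prior_exp", "languages"])
strict = keep_rows_with_data(sample, ["prior_exp", "languages"], require_all=True)
print(len(loose["prior_exp"]), len(strict["prior_exp"]))
```

Under the looser rule, a row with only `prior_exp` filled in is kept, which would be consistent with the output above still showing rows whose follow-up columns are empty. The stricter rule would keep only respondents who answered every selected question.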


I will count how many students across the whole course reported each level of prior experience to give me an overview.

In [8]:
prior_exp_counts: dict[str, int] = count(selected_data["prior_exp"])
print(f"prior_exp_counts: {prior_exp_counts}")

prior_time_counts: dict[str, int] = count(selected_data["prior_time"])
print(f"prior_time_counts: {prior_time_counts}")

prior_exp_counts: {'7-12 months': 59, 'None to less than one month!': 369, '2-6 months': 142, '1-2 years': 31, 'Over 2 years': 19}
prior_time_counts: {'1 month or so': 69, '': 369, 'None to less than one month!': 102, '7-12 months': 14, '2-6 months': 49, '1-2 years': 10, '> 2 years': 7}
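
The 369 empty `prior_time` answers match the 369 "None to less than one month!" answers in `prior_exp`. That suggests, but does not prove, that the follow-up questions were left blank by students who reported no prior experience. One way to check would be to count answer pairs row by row. The helper `count_pairs` and the three-row sample below are hypothetical stand-ins for the real `selected_data`.

```python
def count_pairs(first: list[str], second: list[str]) -> dict[tuple[str, str], int]:
    """Count how often each (first, second) pair of answers occurs row by row."""
    pairs: dict[tuple[str, str], int] = {}
    for a, b in zip(first, second):
        pairs[(a, b)] = pairs.get((a, b), 0) + 1
    return pairs


# Hypothetical stand-in for selected_data, using values seen in the rows above.
sample = {
    "prior_exp": ["7-12 months", "None to less than one month!", "2-6 months"],
    "prior_time": ["1 month or so", "", "None to less than one month!"],
}
pairs = count_pairs(sample["prior_exp"], sample["prior_time"])
print(pairs.get(("None to less than one month!", ""), 0))
```

If the pair ("None to less than one month!", "") accounted for all 369 blanks on the real data, the empty `prior_time` values could be read as "not applicable" rather than as missing responses.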


As the counts above show, 369 of the 620 respondents (about 60%) reported none to less than one month of prior experience, so most students come into the course with almost no prior experience. This backs up my idea by suggesting that students should be given more opportunities to code.
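
To put the raw counts in proportion, they can be converted to percentages. The sketch below copies the `prior_exp` counts from the output above. The helper name `percentages` is hypothetical and not part of `data_utils`.

```python
def percentages(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Counts copied from the prior_exp output above.
prior_exp_counts = {
    "7-12 months": 59,
    "None to less than one month!": 369,
    "2-6 months": 142,
    "1-2 years": 31,
    "Over 2 years": 19,
}

for label, pct in percentages(prior_exp_counts).items():
    print(f"{label}: {pct}%")
```

On these counts, a little under 60% of respondents fall in the lowest experience bracket and roughly another 23% report 2-6 months. The majority are near-beginners, though a sizable minority arrive with some background.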

## Conclusion

In the following markdown cell, write a reflective conclusion given the analysis you performed and identify recommendations.

If your analysis of the data supports your idea, state your recommendation for the change and summarize the data analysis results you found which support it. Additionally, describe any extensions or refinements to this idea which might be explored further. Finally, discuss the potential costs, trade-offs, or stakeholders who may be negatively impacted by this proposed change.

If your analysis of the data is inconclusive, summarize why your data analysis results were inconclusive in the support of your idea. Additionally, describe what experimental idea implementation or additional data collection might help build more confidence in assessing your idea. Finally, discuss the potential costs, trade-offs, or stakeholders who may be negatively impacted by experimenting with your idea.

Finally, if your analysis of the data does not support it, summarize your data analysis results and why it refutes your idea. Discuss the potential costs, trade-offs, or stakeholders who may be negatively impacted by this proposed change. If you disagree with the validity of the findings, describe why your idea still makes sense to implement and what alternative data would better support it. If you agree with the validity of the data analysis, describe what alternate ideas or extensions you would explore instead. 

### Part 5. Conclusion



I found that the analysis of the data supports my idea. Looking at prior experience and prior time spent coding together, I found that most students have little to no experience. This supports my proposal to provide additional optional, ungraded coding assignments, which would give more practice to students who want more examples of the concepts being learned. With more practice opportunities, students would find it easier to understand the concepts.


Since the objective of my idea is to give students more opportunities to gain coding experience, it could be extended by combining it with another of my ideas: the course should have at least one or two program-building group projects to give students experience in collaborative coding. Students would be able to share what they learned and their understanding of concepts with each other, acting as a form of peer tutoring.


The stakeholders who may be negatively impacted are the instructor and possibly the TAs. It falls on them to create these additional exercises, while it is up to the students to take the opportunity, so some students might not make use of the additional exercises at all. That would mean wasted time and effort for the instructor and TAs. However, we should also take into consideration the students who are willing to go out of their way to understand the course material and pursue their passion for coding.