# Setup Tasks - Getting Help

We are here to help you with all aspects of this exercise. These include: following the instructions, designing your pseudocode and real code, and debugging your real code. Do not feel shy about asking for help. Our goal is for you to finish the exercise. 

Help is available through Piazza and Zoom.

Go to this [_link_](https://piazza.com/unc/fall2023/cs102) to sign up for the Piazza course associated with this study. 


Start a Zoom meeting and turn on the recording feature. Select the **"Record on this Computer"** option for recording to keep your recording private. Also, for privacy reasons, **do not turn on video**. One of the goals of this study is to determine the kind of problems people have doing the exercise tasks and how to semi-automatically provide help. This goal is why we request you record your Zoom session now and share the recording with us later. Your meeting recording will be kept private in a secure UNC Google Drive folder. 

Send a private post to instructors with your Zoom meeting link. If you think the problem can be solved using Piazza, make a private post to the instructors describing the problem. If you think the problem cannot be solved using Piazza, or your Piazza post was not sufficient to solve it, make a private Piazza post asking one of the instructors to join your meeting.

# Python Programming and Various Solutions

A classic problem in data science is [counting the number of occurrences of each unique word in a string](https://www.google.com/search?client=safari&rls=en&q=map+reduce+word+count&ie=UTF-8&oe=UTF-8). 

Another classic problem in most forms of text processing is breaking up a string into its component words. 

This exercise will focus on a simpler version of these two problems: counting the number of words in a string.

Some of the research objectives of this session are: 
- Determining the pedagogical value of showing students multiple solutions to the same problem.
- Determining the effort required to convert pseudocode solutions generated by ChatGPT into actual code. 
- Giving us data to automatically determine which kind of solution someone used to solve the problem. 

Some of the learning objectives of this session are: 
- Giving Python programming practice to students with an introductory experience in Python. 
- Showing you that there are multiple solutions to the same problem.
- Giving practice converting pseudocode written in natural language to actual Python code. 
- Giving you practice with the tutoring task of critiquing and improving the coding style of an existing solution to a problem submitted by a student.
- Giving you practice with the tutoring task of giving pseudocode (in natural language) rather than actual code to a student struggling with a problem algorithm.  
- Allowing you to compare pseudocode created manually by you with that generated using ChatGPT.

This exercise assumes you have done some introductory programming in Python using Jupyter Notebooks. It involves alternative and pseudocode solutions to a programming problem. You will play the roles of both a student and a tutor.  

## Setup Tasks - Load Packages

We will begin by loading packages that are necessary for the exercise. Run the cells below to install and load these packages. Recall that `Shift + Enter` is a shortcut in Jupyter to run a cell. 

In [None]:
# Run this cell to install packages. 
# Your Kernel will restart after running this cell. 
%pip install requests

get_ipython().kernel.do_shutdown(restart=True) 

In [None]:
# Run this cell
%autosave 5

import IPython
from StudyHelper import CheckWordCount, StudySetup
# Note - If you get a 'Javascript Error: IPython is not defined' output, just ignore it.

## Setup Tasks - Privacy & Study Consent

To help organize the exercise data, please provide a **chosen_id** and 
**course_taken** in the cell below. 

For **chosen_id**, provide a fake name that **does not identify you**. **Your fake name should only contain letters and numbers. No spaces or symbols!** This name will be associated with a cloud log produced during the exercise so that we can correlate actions from the same person. Later, we will ask you to deposit this log in a secure Google Drive folder and associate your real name with the log and your fake name. 
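If you want to check your fake name before using it, the rule above (letters and numbers only) can be sketched as follows. The helper name `looks_like_valid_id` is hypothetical and is not part of `StudyHelper`; the study's own validation, if any, may differ.

```python
def looks_like_valid_id(candidate):
    """Hypothetical helper: True if candidate is non-empty and uses only ASCII letters and digits."""
    # str.isalnum() is False for the empty string and for any space or symbol.
    # str.isascii() additionally rules out accented or non-Latin letters (an assumption
    # about what "letters" means here).
    return candidate.isascii() and candidate.isalnum()


print(looks_like_valid_id("FakeName123"))
print(looks_like_valid_id("Fake Name!"))
```

The first call should report a usable name and the second should not, because of the space and the exclamation mark.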

You should click and read the information at the following link as part of consenting to participating in this exercise. 

[_IRB Consent Form_](https://drive.google.com/file/d/16vdgMUZNXV5a-DkLFct1d3J10kqj3bSt/view?usp=sharing)

If you consent to the conditions in the linked form, change the **consent** variable below from `False` to `True`. If you do not consent to the conditions, **do not proceed with the exercise**. If you do proceed without consenting, your logged actions will not be used in any research unless you give consent later by signing the IRB consent form linked above. 

In addition, if you are part of a course, give **course_taken** in the format "\<Course\>\<Semester\>\<Year\>" each starting with a capital letter. 
    
For example, if you are taking Comp 116 in Fall 2023, set **course_taken** to "Comp116Fall2023".

Below are values for someone named "jsmith" taking Comp 116 in Fall 2023. 
    
chosen_id="jsmith"

course_taken="Comp116Fall2023"

If you are not doing this exercise as part of a course, you do not have to edit **course_taken**. 

In [None]:
# Edit the below variables and run this cell. 

# Use a fake name. We will add a random number to ensure it is unique. 
chosen_id="FakeName123"

# Do not change
course_taken='Comp116-Fall-2023'

# Change to False if you do not consent to the IRB agreement and want to disable logging. 
consent = True

## Setup Tasks - Logging Startup

Run the cell below to start logging your actions in the cloud. 

If you take a break, that is fine. Just re-run the notebook from the top to the cell below.

In [None]:
# Run this cell before working on study.

notebook = "Study_Notebook_Python_Word_Count.ipynb"
study_identifier = "Lab_Study_Python_Word_Count_Standalone"
StudySetup(chosen_id, course_taken, notebook, study_identifier)

##  Setup Tasks - Exercise Non-Coding Answers 

To submit your answers to the non-coding questions, navigate to the URL below and follow along with the instructions given. 

https://forms.gle/SHEhcHCVU42UF2yJ8

**Note**: Answer all **non-coding** questions in the **Google Form**. **Keep this form open** for the remainder of the exercise. 

# Python Word Count

A classic problem in data science is [counting the number of occurrences of each unique word in a string](https://www.google.com/search?client=safari&rls=en&q=map+reduce+word+count&ie=UTF-8&oe=UTF-8). 

In this exercise, you will do a simpler version of this problem: counting the number of words in a string.

## Question 1 - Programming Problem

First, you will write Python code and pseudocode to solve the word count problem. This coding will give you experience writing code and pseudocode. 

### Question 1.1 - Python Code

Write a function `word_count_own` that returns `num_words`, the number of words in a string. 
**You can assume there is no space within a word.**

**Note:** Some of the relevant Python mechanisms are [_for_](https://www.w3schools.com/python/python_for_loops.asp), [_if_](https://www.w3schools.com/python/python_conditions.asp), [_string indexing_](https://www.w3schools.com/python/gloss_python_for_string.asp), and the string methods [_count()_](https://www.w3schools.com/python/ref_string_count.asp), [_split()_](https://www.w3schools.com/python/ref_string_split.asp), and [_endswith()_](https://www.w3schools.com/python/ref_string_endswith.asp). 

Solve this problem by filling in the function below. 
 

In [None]:
# Question 1.1 - Python Code
# Fill in your answer to the programming problem below. 

def word_count_own(temp_text):
    '''Return the number of words'''
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to check your solution.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 1.1 - Python Code")
CheckWordCount(word_count_own, test_text)

### Question 1.2 - English Pseudocode/Comment

Imagine you are a tutor (undergraduate assistant, graduate assistant, or instructor) trying to help a student reach the coding solution you gave above. Write a pseudocode description in English that you would give to that student of a word count function that returns the number of words in a string. The description should be in English and contain the information you would put in a comment describing the coding algorithm. 

Ideally, in writing the pseudocode, you should try to meet the following goals. 

The pseudocode should help you separate the underlying algorithm from the implementation. Moreover, it should contain enough detail to help a student stuck on the problem, yet not so much detail that converting it into real code would no longer be a valuable learning exercise. 

You may not be able to achieve these goals simultaneously. We later ask you how well you think you met these goals. 

Give your pseudocode in the **Google Form**.

Given the pseudocode you were able to compose above, how strongly do you agree with the given statements? Give answers in the **Google Form**. 

### Question 1.3

Writing the pseudocode above was a valuable learning exercise as it significantly increased my understanding of separating the underlying algorithm from the implementation. 


### Question 1.4

If a student is stuck on the given problem and has no clue what to do, converting the pseudocode description above to real code would be a valuable learning exercise.  


### Question 1.5 - 1.9 - Additional Pseudocode Solutions

If the solution you gave above was the only solution you considered, skip to **Question 1.10**. 

In the following questions, give pseudocode for the alternative solutions you considered. You may fill in up to five solutions (**Question 1.5 - Question 1.9**). Again, assume you are a tutor giving this pseudocode to a student or commenting on the coding algorithm.  

Give your answers to these questions in the **Google Form**.


### Question 1.10 - Pseudocode Value

This next question aims to determine the value of writing pseudocode for algorithms you did not implement in real code. 

How strongly do you agree with the given statement? Give answers in the **Google Form**.

Translating the algorithm in your head into pseudocode was a valuable learning exercise. 

## Question 2 - Code from Pseudocode 

In the following section, we will give you pseudocode for three popular algorithms to solve the word count problem given by students in a data science programming course. ChatGPT automatically generated this pseudocode from these solutions. You may have written real code for one of these algorithms and given pseudocode for some of them. 

The goals of the following questions are to determine (1) the value of seeing pseudocode of an algorithm you have not considered, (2) how much you learn from converting pseudocode into real code, and (3) how pseudocode generated by ChatGPT compares with human-generated pseudocode. 

You will first write real code from a pseudocode description and then answer some questions. 

Fill in the word count function in the cell below each description. 


### Question 2.1 - Split Approach - ChatGPT Description

Pseudocode Description:

To count the number of words, the function first splits the string using the `split()` method, which divides the string into a list of words based on spaces. Then, it uses the `len()` function to determine the number of words in the resulting list, which is equivalent to the number of words in the original string. Finally, the function returns this number.
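One detail the description leaves open is what "based on spaces" means. As a small illustration (the sample string is made up for this sketch), Python's `split()` behaves differently with and without an explicit separator:

```python
# A made-up sample with leading spaces, a run of spaces, and a trailing newline.
sample = "  two   words\n"

# With no argument, split() treats any run of whitespace as one separator
# and drops empty strings at the ends.
default_split = sample.split()

# With an explicit " " separator, every single space is a boundary,
# so runs of spaces produce empty strings.
single_space_split = sample.split(" ")

print(default_split)       # ['two', 'words']
print(single_space_split)  # ['', '', 'two', '', '', 'words\n']
```

For text with exactly one space between words and none at the ends, such as the test sentence used in this notebook, both calls give the same list.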

### Question 2.1 - Split Approach - Code Solution 

In [None]:
# Question 2.1 - Split Approach - Code Solution
# Write your code for the split solution here.
# If you gave this solution earlier, just reproduce it. 

def word_count_split(temp_text):
    '''Return the number of words'''
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to check your solution.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 2.1 - Split Approach - Code Solution")
CheckWordCount(word_count_split, test_text)

### Question 2.2 - Count Approach - ChatGPT Description

Pseudocode Description:

To count the number of words, the function first uses the `count()` method on the string to count the number of spaces in the string. This will give us the number of gaps between the words in the string. Adding 1 to this count gives us the number of words in the string, since the number of words is always one more than the number of gaps between them.
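The "one more than the number of gaps" reasoning holds when words are separated by exactly one space and the string has no leading or trailing spaces. A minimal sketch of that assumption, using made-up sample strings:

```python
# Build a string from a known list of words, joined by single spaces.
words = ["one", "two", "three"]
text = " ".join(words)

# With single spaces only, the number of spaces is one less than the number of words.
gaps = text.count(" ")
print(gaps, len(words))  # 2 3

# With extra spaces, the space count no longer matches the gaps between words.
messy = " one  two "
messy_spaces = messy.count(" ")
print(messy_spaces)  # 4 spaces, but only 2 words
```

The test sentence in this notebook satisfies the single-space assumption.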

### Question 2.2 - Count Approach - Code Solution

In [None]:
# Question 2.2 - Count Approach - Code Solution
# Write your code for the count solution here.
# If you gave this solution earlier, just reproduce it.

def word_count_count(temp_text):
    '''Return the number of words'''
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to check your second solution.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 2.2 - Count Approach - Code Solution")
CheckWordCount(word_count_count, test_text)

### Question 2.3 - Loop Approach - ChatGPT Description

Pseudocode Description:

To count the number of words, the function iterates through each character in the string using a `for` loop. For each character, it checks if it is a space character (represented by the string `" "`). If the character is a space, the function increments a counter variable by 1. This way, the function counts the number of spaces in the string, which is equivalent to the number of words minus 1.

Because the function starts with a counter of 1 and increments it for each space character it finds, it ends up with the correct count of words in the string, including the first word, which doesn't have any space before it.

### Question 2.3 - Loop Approach - Code Solution

In [None]:
# Question 2.3 - Loop Approach - Code Solution
# Write your code for the loop solution here.
# If you gave this solution earlier, just reproduce it.

def word_count_loop(temp_text):
    '''Return the number of words'''
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to check your third solution.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 2.3 - Loop Approach - Code Solution")
CheckWordCount(word_count_loop, test_text)

## Question 3 - Code from Pseudocode Feedback 

You just converted pseudocode for the popular algorithms to solve the word count problem into real code. The following questions interrogate the pedagogical value of that process and ask you to compare your pseudocode to the pseudocode of ChatGPT. 

How strongly do you agree with the given statements? Give answers in the **Google Form**.

### Question 3.1

Seeing these ChatGPT descriptions significantly increased my understanding of how pseudocode should be written as a tutor helping a student with an algorithm and as a programmer documenting my algorithms. 

### Question 3.2 

My pseudocode descriptions met the two pseudocode requirements (helping you separate the underlying algorithm from the implementation, and containing enough detail to help a student stuck on the problem) better than the ChatGPT descriptions.

### Question 3.3 

Seeing the pseudocode was a valuable learning experience for algorithms I had not considered. 

### Question 3.4

Converting pseudocode to real code was a valuable learning experience for algorithms I had not coded. 


## Question 4 - Setup

You just saw some popular algorithms to solve the word count problem. Below, we show you two obscure, improvable algorithms that correctly solve the problem. The goal here is twofold: (1) we want to understand how much effort is required to understand these obscure solutions, and (2) we want your opinion on what kind of feedback, if any, should be given to those who write these solutions. 

Study the solution labeled as **S4** and try to understand how it works. 

### S4 - Solution 4 - Code

In [None]:
# Question 4 - Code Solution
def word_count_s(temp_text):
    '''Return the number of words'''
    
    num_words = 0
    for word in temp_text.split():
        num_words += len(word.split())
    
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to confirm the above solution works.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 4 - Code Solution")
CheckWordCount(word_count_s, test_text)

## Question 4 - Style Feedback

This solution does produce the correct return value, but the solution's style could improve by implementing **either** of the given feedback statements. 

1. Consider replacing the `len(word.split())` with `1` because `len(word.split())` will always evaluate to `1` when used in this `for` loop context. 


2. Consider removing the `for word in temp_text.split():` loop and calling `split()` on the whole string instead, that is, replacing `word` with `temp_text` in the `len(word.split())` expression. 
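To see why either piece of feedback preserves the result, here is a sketch that places **Solution 4** next to one possible reading of each feedback statement. The names `word_count_feedback_1` and `word_count_feedback_2` are made up for this illustration and are not part of the study materials.

```python
def word_count_s(temp_text):
    '''Solution 4 as given above.'''
    num_words = 0
    for word in temp_text.split():
        # Each item produced by split() contains no whitespace,
        # so word.split() is a one-element list here.
        num_words += len(word.split())
    return num_words


def word_count_feedback_1(temp_text):
    '''One reading of feedback 1: add 1 per word instead of len(word.split()).'''
    num_words = 0
    for word in temp_text.split():
        num_words += 1
    return num_words


def word_count_feedback_2(temp_text):
    '''One reading of feedback 2: drop the loop and split the whole string once.'''
    return len(temp_text.split())


test_text = "This sentence will be used to check your answer. It is not a long sentence."
print(word_count_s(test_text), word_count_feedback_1(test_text), word_count_feedback_2(test_text))
```

All three functions should print the same count for the test sentence.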

## Question 4

Assume that you are a student who submitted **Solution 4**. Answer how strongly you agree with the following statements. Give answers in the **Google Form**. 

### Question 4.1 

I would have benefited from the individual style feedback before submitting the assignment. 

### Question 4.2 

Assuming it is impossible or undesirable to get individual style feedback before submitting, I would have benefited from seeing this feedback after the assignment is submitted and graded. 

### Question 4.3 

Assuming I did not get individual style feedback before or after I submitted the assignment, I should have gotten fewer points for giving this solution.

## Question 5 - Setup 

Now, you will assume the role of an LA and get practice with critiquing and improving the coding style of an existing solution to a problem submitted by a student.

Examine the solution to `word_count` labeled as **Solution 5** given below. Try to understand the algorithm used to solve the problem. 

### S5 - Solution 5 - Code 

In [None]:
# Question 5 - Code Solution
def word_count_c(temp_text):
    '''Return the number of words'''
    
    for x in range(len(temp_text)):
        sentence=temp_text.split()
        num_words=len(sentence)
    
    # num_words: the number of words in the string temp_text
    return num_words

In [None]:
# Run this cell to confirm the above solution works.

test_text = "This sentence will be used to check your answer. It is not a long sentence."

print("Checking Question 5 - Code Solution")
CheckWordCount(word_count_c, test_text)

## Question 5 - Potential Style Feedback 

Below are five style feedback statements. Some of these feedback statements **may not be correct**. Read the feedback carefully. 

1. Consider removing the `for x in range(len(temp_text))` and replacing with `for x in temp_text` so you don't have to reference characters by index variable `x`. 

2. Consider removing the `for x in range(len(temp_text))` completely since the value of `x` is not used in the given `for` loop. 

3. Consider renaming `x` in `for x in range(len(temp_text))` to `index` since `x` represents an index position in the string `temp_text`. 

4. Consider removing `sentence=temp_text.split()` and replacing with `if temp_text[x] == ' ':`. Also change `num_words=len(sentence)` to `num_words +=1`. 

5. Consider moving `num_words=len(sentence)` after the `for` loop and not inside the `for` loop. 



## Question 5

Assuming you were a tutor, answer the following based on **Solution 5** and the corresponding **Style Feedback**. Give answers in the **Google Form**. 

### Question 5.1 

Which style advice do you think is appropriate to give to improve the style of **Solution 5**? You may choose more than one. Assume **only one** piece of advice is given as feedback. 

### Question 5.2 

As a human, what advice would you give to a person to correct the style of **Solution 5**? Give as many details as possible in English (without writing code).  

## Question 6 - Setup

One of the goals of this research is to automatically group solutions based on the algorithm used. The following questions ask, as a tutor, if you would like to see information about the nature of these groups so that you can give group feedback. 

Imagine we took the solutions to the above-mentioned problem and divided them into two categories. These categories are as follows: 

- Canonical Solutions - These are solutions that use exactly **one** of the **Split Approach**, **Count Approach**, or **Loop Approach**. Solutions **1, 2, and 3** are canonical solutions.

- Hybrid Solutions - These are improvable solutions and use a **combination** of `split()`, `count()`, and `for/if`. Solutions **4 and 5** are hybrid solutions. 

## Question 6

As a tutor, assume we want to understand how many students gave **Canonical** vs. **Hybrid** solutions. We ask you the following question to determine how surprising the division between canonical and hybrid would be to you. 

Give an answer to the following question in the **Google Form**. 


### Question 6.1 

What percentage of students would give **Canonical** vs. **Hybrid** solutions if this problem were given on an assignment or during an exercise? 

## Question 7


Finally, assume that as a tutor, you notice many students submitted a hybrid solution. Answer how strongly you agree with the following statements. Give answers in the **Google Form**. 


### Question 7.1 

If there were enough time in class and/or recitation, I think it would be valuable to address some of the hybrid solutions to improve student coding style.

### Question 7.2 

I would create a solution guide document describing some hybrid solutions and how their style could be improved.

## Question 8 

Having reached the end of the lab study, the last set of questions will focus on your experience during the lab study. Give answers in the **Google Form**. 

### Question 8.1 

Did this lab study give you any new ideas about how you would write pseudocode in the future? 

### Question 8.2 

Did this lab study give you any new ideas about how you would write real code in the future? 

### Question 8.3 

Did this lab study give you any new ideas about how you would tutor for a coding course in the future? 

### Question 8.4 

Would you recommend the exercises in this lab study to a person with a similar background to your own? 

### Question 8.5 

We would really appreciate any additional impressions/thoughts you have about the exercises in this lab study. You can elaborate on your previous answers in paragraph form or comment on issues that we did not address in the study. 

## End Exercise

**Be sure to save your notebook a final time after filling in your final answers.** 

**Close your Zoom meeting just before you submit your final answers.**

Follow submission instructions in the **Google Form** carefully. 