# Final Project: Getting Unstuck Reflections

## Framing

## Introduction
We have 3,784 projects from [Getting Unstuck](https://www.gettingunstuck.gse.harvard.edu), a 21-day online learning experience held in July 2018. K-12 teachers made these projects in Scratch, responding to a different prompt each day. Projects vary in size and scope, and participants had varied levels of experience.

## Research question(s)
* What did people learn? 
* How did they talk about what they learned? 
* How did reflecting support their learning?

## Hypotheses

    
### Hypothesis one: Learners who wrote more reflections had a deeper, more multifaceted learning experience; their reflections supported their learning in meaningful ways.

* They participated more in the online community.
* Their computational fluency was deeper.
* They learned more about their own tendencies for "getting stuck."

#### Background:
We know from constructionist theories that environments that promote creating, sharing, reflecting, and personalizing can support learners in engaging in deeply meaningful work (Brennan, 2013). Often, the "reflecting" part is taken for granted, or seen as less essential when teachers have limited amounts of time. And yet, supporting teachers in becoming increasingly reflective practitioners can support productive growth of their teaching practice (Schön, 1987).

In what ways does reflecting help learners learn, and help us as researchers understand what learners made?


### Hypothesis two: Learners who tagged each other more often had a deeper learning experience.

* They participated more in the online community.
* Their computational fluency was deeper.
* They were able to support others (reciprocal tagging?)

#### Background:
From the communities of practice literature (Lave & Wenger, 1991), we know that learners who engage in legitimate peripheral participation in a community of practice can develop meaningfully from novices to experts. 

In the Getting Unstuck online community, did learners feel like they were in a community? Or did they operate by themselves, receiving the challenge emails each day and posting their projects? Did they interact with others' projects? How did those interactions support their learning? (See also: Illich, 1970).






## Results
* How are you planning to test each hypothesis? What models are you thinking of using?
* What are the best results you can hope for? Is that interesting / relevant for other researchers?
* What are implications of your potential findings for practitioners?

### Hypothesis One
I plan to start testing the first hypothesis with descriptive work (for example, sentiment analysis and topic modeling) to get a sense of what's in the data, guided by these questions:
* How many projects have written text accompanying them?
* Of those projects, how many include text that counts as "reflection"?
* Of those reflective projects, what did they talk about?
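
The first two questions can be roughed out with simple counts before any modeling. The sketch below uses a toy table in place of the real data, and the regular expression for "reflection" is a hypothetical first-pass proxy (first-person learning language), not a validated coding scheme.

```python
import pandas as pd

# Toy stand-in for the project table; the real data has the same two text columns.
projects = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "description": ["I learned to use clones.", None, "  ", "Made for Day 3"],
    "instructions": ["Press the green flag.", None, "Click the cat", None],
})

text_cols = ["description", "instructions"]

# Treat missing values and whitespace-only strings as "no text".
cleaned = projects[text_cols].fillna("").apply(lambda col: col.str.strip())
has_text = cleaned.ne("").any(axis=1)

# Hypothetical proxy for reflection: first-person statements about learning or struggling.
reflection_pattern = r"\bI (?:learn(?:ed|t)?|struggled|realized|found|got stuck)\b"
looks_reflective = cleaned["description"].str.contains(
    reflection_pattern, case=False, regex=True
)

n_with_text = int(has_text.sum())
n_reflective = int(looks_reflective.sum())
print(f"{n_with_text} of {len(projects)} projects have text; {n_reflective} look reflective")
```

A keyword filter like this will miss reflections phrased differently or written in other languages, so it is best treated as a way to pull a sample for hand-coding rather than as a final measure.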

### Hypothesis Two
I plan to build a dataframe with one row per person who used tags, plus columns counting the tags they made. To do that, I'll need to match usernames to author ids, which means pulling from another spreadsheet.
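
A minimal sketch of that per-person tag table, assuming tags appear in the written text as `@username` mentions (as in "Big thanks to @jsh"). The toy rows, the `writing` column, the `extract_mentions` helper, and the username character set in the pattern are all assumptions for illustration.

```python
import re

import pandas as pd

# Toy rows; "writing" stands in for the combined description/instructions text.
projects = pd.DataFrame({
    "author/id": [101, 101, 202, 303],
    "writing": [
        "Big thanks to @jsh for his Day 1 project!",
        "Inspired by @jsh and @maria_k",
        "Press the green flag.",
        None,
    ],
})

# Assumed mention format: "@" followed by letters, digits, hyphens, or underscores,
# and not preceded by a word character (to skip things like email addresses).
MENTION = re.compile(r"(?<!\w)@([A-Za-z0-9_-]+)")


def extract_mentions(text):
    """Return the lowercased usernames mentioned in one piece of text."""
    if not isinstance(text, str):
        return []
    return [name.lower() for name in MENTION.findall(text)]


projects["mentions"] = projects["writing"].apply(extract_mentions)
projects["n_tags"] = projects["mentions"].apply(len)

# One row per author: how many projects they shared and how many tags they made.
tag_counts = (
    projects.groupby("author/id")["n_tags"]
    .agg(n_projects="count", tags_total="sum")
    .reset_index()
)
print(tag_counts)
```

Exploding the `mentions` column would give one row per (author, tagged user) pair, which is the shape needed to look at reciprocal tagging once usernames are matched to author ids.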

## Threads
* Describe issues that might arise during the analyses above
* Come up with backup plans in case you run into these issues

### Issues
* Insufficient data in each project! (not enough projects)
* Can't get all the data I need from the API (comments, favorites, etc.)
* How to measure computational fluency? One idea is to look at the Creative Computing Guide and say: teachers teaching with this "should" be able to use more than one sprite, more than one costume, and also be able to use loops, initialization, parallelism, and variables...
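
One way to make the fluency idea concrete is to turn each expectation into a yes/no indicator and add them up. The sketch below covers only the three expectations that map onto per-project counts (sprites, costumes, variables); the thresholds and the `fluency_score` name are assumptions, and loops, initialization, and parallelism would need block-level data that these counts do not contain.

```python
import pandas as pd

# Toy rows using count columns of the kind found in all_studios.csv.
stats = pd.DataFrame({
    "project_id": [1, 2, 3],
    "sprite_count": [1, 3, 2],
    "costume_count": [1, 6, 2],
    "variable_count": [0, 2, 0],
})

# Hypothetical indicators: each is True when the project meets one expectation.
indicators = pd.DataFrame({
    "multi_sprite": stats["sprite_count"] > 1,
    "multi_costume": stats["costume_count"] > 1,
    "uses_variables": stats["variable_count"] > 0,
})

# A simple additive score from 0 to 3; equal weighting is itself an assumption.
stats["fluency_score"] = indicators.sum(axis=1)
print(stats[["project_id", "fluency_score"]])
```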

### Backup plans
* Still conduct a lot of descriptive analysis of the data and try a few different analytical angles, as well as writing a lit review of text analysis methods for reflective/student writing
* Expand analysis to include not only reflections (text data), but also Scratch projects (look at their blocks)
    
### Papers to explore
* Q. Liu, S. Zhang, Q. Wang and W. Chen, "Mining Online Discussion Data for Understanding Teachers Reflective Thinking," in IEEE Transactions on Learning Technologies, vol. 11, no. 2, pp. 243-254, 1 April-June 2018.
doi: 10.1109/TLT.2017.2708115. https://ieeexplore.ieee.org/document/7934007
* http://users.on.net/~kirsty.kitto/papers/lak15-gibson_kitto-short-FINAL.pdf 
* https://www.ajpe.org/doi/full/10.5688/ajpe80110
* https://www.researchgate.net/publication/328942550_Automatic_Reflective_Writing_Analysis_based_on_Semantic_Concept
* https://link.springer.com/article/10.1007/s40593-019-00174-2 
* https://dl.acm.org/citation.cfm?id=2883951
* https://dl.acm.org/citation.cfm?id=2883955 
* https://eric.ed.gov/?id=EJ1062704
* https://www.tandfonline.com/doi/abs/10.1080/07294360.2010.512627
* https://onlinelibrary.wiley.com/doi/abs/10.1046/j.1365-2923.2002.01227.x 


### Courses to explore
* http://web.stanford.edu/class/cs224n/
* https://monkeylearn.com/sentiment-analysis/
* Datacamp - NLP basics


## Data Exploration

Describe your raw data below; provide definition / explanations for the measures you're using

### Variable explanations
* **studio:** What studio/day each Scratch project is added to (though projects could've been added to the studio/started after everyone else, if a participant acted asynchronously); zero-indexed. 0 is studio 1, 20 is studio 21.
* **id:** The project id (randomly assigned number)
* **title:** Title of the project
* **description:** Online, this is labeled "Notes/Credits"
* **instructions:** Online, this is labeled "Instructions." Reflections are available in both, so one thought is to combine all the written text into one variable for each project, then look to filter out instructions.
* **author/id:** ID of the Scratcher; not linked to their Scratch username
* **image:** cover image for the project (arbitrary)
* **history/created:** date created (online; could've been made in the offline editor, so this is slightly inaccurate)
* **history/modified:** last date modified; doesn't show a history of modifications
* **history/shared:** date shared; likely shared after completion, but could be shared as a draft (and that was highly encouraged)
* **stats/views:** number of views (up to the day data was collected - Aug 2018)
* **stats/loves:** number of loves (up to the day data was collected - Aug 2018)
* **stats/favorites:** number of favorites (or stars; up to the day data was collected - Aug 2018)
* **stats/comments:** number of comments (currently inaccurate)
* **stats/remixes:** number of remixes (currently inaccurate)
* **remix/parent:** if remixed from another project, this column is filled with that project id
* **remix/root:** if remixed from another project, this column is filled with the project id of the root (if there was a string of remixes; see [Remix Tree](https://en.scratch-wiki.info/wiki/Remix#Remix_Trees)).
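
As a small illustration of how the studio index and the `history/*` timestamps might be used, the sketch below converts the zero-indexed studio to a day number and measures the gap between creation and sharing. The two rows reuse timestamps that appear in the data; the `day` and `minutes_to_share` column names are hypothetical.

```python
import pandas as pd

sample = pd.DataFrame({
    "studio": [13, 18],
    "history/created": ["2018-07-27T07:49:43.000Z", "2018-07-23T11:27:54.000Z"],
    "history/shared": ["2018-07-27T07:55:50.000Z", "2018-07-23T11:38:22.000Z"],
})

# Parse the ISO 8601 strings into timezone-aware timestamps.
for col in ["history/created", "history/shared"]:
    sample[col] = pd.to_datetime(sample[col], utc=True)

# The studio column is zero-indexed, so day = studio + 1.
sample["day"] = sample["studio"] + 1

# Minutes between creating the project online and sharing it.
elapsed = sample["history/shared"] - sample["history/created"]
sample["minutes_to_share"] = elapsed.dt.total_seconds() / 60
print(sample[["day", "minutes_to_share"]])
```

Because projects could be started in the offline editor, a short gap here does not necessarily mean a quickly made project.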

In [10]:
# using glob, find all the CSV files in the "studios" folder
import glob

files = glob.glob('./studios/*.csv')
print(files)



['./studios/studio19.csv', './studios/studio18.csv', './studios/studio20.csv', './studios/studio21.csv', './studios/studio8.csv', './studios/studio9.csv', './studios/studio7.csv', './studios/studio6.csv', './studios/studio4.csv', './studios/studio5.csv', './studios/studio1.csv', './studios/studio2.csv', './studios/studio3.csv', './studios/studio10.csv', './studios/studio11.csv', './studios/studio13.csv', './studios/studio12.csv', './studios/studio16.csv', './studios/studio17.csv', './studios/studio15.csv', './studios/studio14.csv']


In [11]:
# put all CSV files into a single data frame

import os
import re

import numpy as np
import pandas as pd

dfs = []

# iterate over the CSV files; read each one and add a zero-indexed studio column
for filename in files:
    # take the number from the file name only (e.g. "studio19.csv" -> 19), then zero-index it
    filenum = int(re.findall(r'\d+', os.path.basename(filename))[0]) - 1
    df = pd.read_csv(filename, index_col=None, header=0)
    df.insert(0, 'studio', filenum)
    dfs.append(df)

# concatenate the list of dataframes
frame = pd.concat(dfs, axis=0, ignore_index=True, sort=False)
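
Since the studio index comes from the file name, a quick check that all 21 studios were found can catch a missing or misnamed file early. This is a sketch: the `studio_index` helper is hypothetical, and the file list is rebuilt here from the naming pattern shown in the printed output rather than read from disk.

```python
import os
import re


def studio_index(path):
    """Return the zero-based studio index encoded in a name like 'studio19.csv'."""
    match = re.search(r"studio(\d+)\.csv$", os.path.basename(path))
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return int(match.group(1)) - 1


# Stand-in for the glob result: studio1.csv through studio21.csv.
files = [f"./studios/studio{n}.csv" for n in range(1, 22)]

indices = sorted(studio_index(f) for f in files)
missing = sorted(set(range(21)) - set(indices))
print("missing studios:", missing)
```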

In [12]:
# check what the dataframe looks like

frame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3784 entries, 0 to 3783
Data columns (total 24 columns):
studio                 3784 non-null int64
id                     3784 non-null int64
title                  3784 non-null object
description            2965 non-null object
instructions           3322 non-null object
author/id              3784 non-null int64
image                  3784 non-null object
history/created        3784 non-null object
history/modified       3784 non-null object
history/shared         3784 non-null object
stats/views            3784 non-null int64
stats/loves            3784 non-null int64
stats/favorites        3784 non-null int64
stats/comments         3784 non-null int64
stats/remixes          3784 non-null int64
remix/parent           316 non-null float64
remix/root             316 non-null float64
Unnamed: 16            2 non-null object
description (full)     2 non-null float64
instruction (full)     2 non-null float64
description (blank)    2 non

In [13]:
frame.head(10)

frame.tail(10)

Unnamed: 0,studio,id,title,description,instructions,author/id,image,history/created,history/modified,history/shared,...,stats/remixes,remix/parent,remix/root,Unnamed: 16,description (full),instruction (full),description (blank),instruction (blank),description.1,instructions.1
3774,13,237543092,Getting Unstuck Day 14,Made for #GettingUnstuck Day 14 #CreativeCompu...,Click the Green Flag and follow the instructions.,287832,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-27T07:49:43.000Z,2018-07-27T08:05:02.000Z,2018-07-27T07:55:50.000Z,...,0,,,,,,,,,
3775,13,237685360,Getting Unstuck: Day14,Code Club project 'Chatbot' helped a lot. Thi...,Click on the robot to start!\n\nChat with the ...,34197892,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-29T13:53:18.000Z,2018-07-29T21:25:27.000Z,2018-07-29T21:18:15.000Z,...,0,,,,,,,,,
3776,13,237765948,Adivina el animal Getting unstuck day 14,I have learned to count letter positions in wo...,Juego para adivinar nombres de animales. \nInt...,2680901,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-30T14:49:51.000Z,2018-07-30T15:39:53.000Z,2018-07-30T15:33:13.000Z,...,0,,,,,,,,,
3777,13,237771675,getting unstuck day 14,I learn to use join\n,,23020553,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-30T16:04:24.000Z,2018-08-03T17:47:19.000Z,2018-07-30T16:12:40.000Z,...,0,,,,,,,,,
3778,13,237823642,#unstuck Day 14 string blocks,I had to look at a video clip o how to string.,Start with clicking on the microphone.,31492686,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-31T02:56:38.000Z,2018-07-31T07:18:18.000Z,2018-07-31T07:18:18.000Z,...,0,,,,,,,,,
3779,13,237860907,Io trovo le lettere,Getting Unstuck 14\nCreare \nCreare un progett...,rispondi,21462276,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-31T13:37:38.000Z,2018-07-31T14:02:51.000Z,2018-07-31T14:01:14.000Z,...,0,,,,,,,,,
3780,13,237999490,Getting Unstuck - Day 14,The challenge for day 14 is to create a projec...,Click the green arrow and enter your name when...,24696341,https://cdn2.scratch.mit.edu/get_image/project...,2018-08-01T22:09:01.000Z,2018-08-01T23:04:55.000Z,2018-08-01T23:01:32.000Z,...,0,,,,,,,,,
3781,13,237792845,GETTING UNSTUCK14: FIGURES,,"PRESS THE GREEN FLAG\nPLEASE, WRITE IN BLOCK L...",16573108,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-30T20:32:00.000Z,2018-08-06T21:50:05.000Z,2018-08-06T21:50:05.000Z,...,0,,,,,,,,,
3782,13,238433917,day 14,Create a project that uses the string blocks i...,,24618980,https://cdn2.scratch.mit.edu/get_image/project...,2018-08-07T09:55:31.000Z,2018-08-07T15:51:56.000Z,2018-08-07T10:06:28.000Z,...,0,,,,,,,,,
3783,13,238271691,Getting Unstuck - Day 14,"When I was younger, I would always create stag...",Press the green button and wait to hear instru...,25334812,https://cdn2.scratch.mit.edu/get_image/project...,2018-08-05T12:55:06.000Z,2018-08-07T15:51:11.000Z,2018-08-07T15:49:40.000Z,...,0,,,,,,,,,


## Data Cleaning

Clean your data in this section, and make sure it's ready to be analyzed for next week!

In [14]:
# drop the unnecessary columns: the inaccurate comment/remix counts and the stray, nearly empty columns
badColumns = ["stats/comments","stats/remixes","Unnamed: 16",
              "description (full)","instruction (full)","description (blank)",
              "instruction (blank)","description.1","instructions.1"]

frame = frame.drop(columns=badColumns)

frame.head(10)

Unnamed: 0,studio,id,title,description,instructions,author/id,image,history/created,history/modified,history/shared,stats/views,stats/loves,stats/favorites,remix/parent,remix/root
0,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,27,16,1,,
1,18,237100206,Grumpy Bubbles,"Big thanks to @jsh for his Day 1 project, Rive...","First, check out @jsh's River Waltz: https://s...",39526,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T00:25:41.000Z,2018-07-23T10:07:59.000Z,2018-07-23T10:07:59.000Z,29,11,0,235484400.0,235484400.0
2,18,237093421,Raindrops,I've found that I don't always initially consi...,Press the green flag.,25705937,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T22:22:17.000Z,2018-07-23T13:33:50.000Z,2018-07-23T10:37:53.000Z,29,18,4,,
3,18,237098671,Night Rain,After simulating rain drops for today's challe...,"Press the green flag, allow for the microphone...",25705937,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T00:06:28.000Z,2018-07-23T11:03:43.000Z,2018-07-23T10:38:58.000Z,21,8,4,,
4,18,237111696,Lights,This project was created for Day 19 of Getting...,Enjoy the procession of the clones!,56239,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T03:16:00.000Z,2018-07-23T17:44:58.000Z,2018-07-23T10:59:22.000Z,30,14,7,,
5,18,62591162,Slim Cantore - Penquin Weather Channel Event,\n\n,This project is based on a Weather Story by We...,4615776,https://cdn2.scratch.mit.edu/get_image/project...,2015-05-16T05:27:51.000Z,2018-07-23T11:12:00.000Z,2015-05-21T00:13:38.000Z,32,9,2,24001065.0,24001065.0
6,18,237136651,Unstuck Day 19 Flowers,Three flowers on one stem is a variation from ...,Thanks for today's challenge! I have used clon...,14632339,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T10:50:19.000Z,2018-07-23T19:44:57.000Z,2018-07-23T11:04:11.000Z,9,6,0,,
7,18,237134693,Unstuck Day 19: Using Clones,I did this project for Getting Unstuck challen...,"Play with ripples! \nJust added sound, too.\nS...",214174,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T10:11:35.000Z,2018-07-23T12:17:09.000Z,2018-07-23T11:32:06.000Z,17,10,3,,
8,18,237138514,Day19,,Click on the green flag to start,34225248,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T11:27:54.000Z,2018-07-23T11:38:35.000Z,2018-07-23T11:38:22.000Z,12,4,0,,
9,18,237138676,#019 Getting Unstuck Clones,I didn't have a lot of time to do this today.....,Move mouse around for clones...,20126730,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-23T11:31:36.000Z,2018-07-23T11:44:12.000Z,2018-07-23T11:41:27.000Z,21,5,0,,


In [15]:
# import all_studios CSV to get the author ID


dfAuthor = pd.read_csv('all_studios.csv', index_col=None, header=0)

dfAuthor.head()

# dfAuthor.info()


Unnamed: 0,project_id,project_author_id,project_author_username,studio_numberscript_count,variable_count,list_count,comment_count,costume_count,sprite_count,block_count,block_unique_count,random_block_count,Unnamed: 12
0,235615218,34197892,33limekilnsja,1,4,0,0,0,6,2,19,8,0
1,235715853,34197892,33limekilnsja,2,2,0,0,0,3,1,9,8,2
2,235762150,34197892,33limekilnsja,3,3,1,0,0,12,3,13,11,0
3,235832649,34197892,33limekilnsja,4,2,0,0,0,7,4,18,11,0
4,235934422,34197892,33limekilnsja,5,3,0,0,0,6,2,17,12,0
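
The column labels in this output look shifted: `studio_numberscript_count` reads like two names fused by a missing comma, there is a populated `Unnamed: 12` column, and in the first row `block_unique_count` (19) is larger than `block_count` (2), which should not be possible. If that reading is right, supplying the column names explicitly would realign the values. The sketch below reproduces the suspected problem on an in-memory CSV built from the first row; the corrected name list is an assumption about what the file's author intended.

```python
import io

import pandas as pd

# Toy CSV with the suspected header problem: a fused name plus a trailing comma.
raw_text = (
    "project_id,project_author_id,project_author_username,studio_numberscript_count,"
    "variable_count,list_count,comment_count,costume_count,sprite_count,block_count,"
    "block_unique_count,random_block_count,\n"
    "235615218,34197892,33limekilnsja,1,4,0,0,0,6,2,19,8,0\n"
)

# Reading it as-is reproduces the shifted labels and the extra unnamed column.
naive = pd.read_csv(io.StringIO(raw_text))

# Assumed intended names: "studio_number" and "script_count" as separate columns.
correct_names = [
    "project_id", "project_author_id", "project_author_username",
    "studio_number", "script_count", "variable_count", "list_count",
    "comment_count", "costume_count", "sprite_count", "block_count",
    "block_unique_count", "random_block_count",
]

# header=0 skips the broken header row; names= supplies the replacement labels.
fixed = pd.read_csv(io.StringIO(raw_text), header=0, names=correct_names)
print(fixed[["studio_number", "script_count", "block_count", "block_unique_count"]])
```

With the realigned labels, the first project has 19 blocks of 8 unique types, which is at least internally consistent. It would still be worth confirming against the script that generated `all_studios.csv`.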


In [None]:
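# Look for the author username; add column to dataframe.

# frame.head()

# match the column names so both dataframes share an "author/id" key
dfAuthor = dfAuthor.rename(columns={"project_author_id": "author/id"})

dfAuthor.head()


# THIS PART ISN'T WORKING: both dataframes have several rows per author, so merging on
# "author/id" alone is many-to-many and repeats each project once for every project by
# the same author. Merging on the project id ("id" in frame, "project_id" in dfAuthor)
# should give one row per project instead.
# merged_df = frame.merge(dfAuthor, how = 'inner', on = ['author/id'])

# merged_df.head()

# merged_df.info()

# merged_df = merged_df.drop(['Unnamed: 12'],axis=1)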
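
A likely cause of the repeated rows is that both tables contain several rows per author, so a merge keyed on the author id pairs every project with every row for that author. The toy example below shows the effect and two possible alternatives: merging on the project id, or merging against a de-duplicated username lookup. The toy frames and usernames are made up; the `validate` argument asks pandas to raise an error if the keys are not as unique as expected.

```python
import pandas as pd

frame_toy = pd.DataFrame({
    "id": [10, 11, 12],
    "author/id": [1, 1, 2],
    "title": ["Day 1", "Day 2", "Day 1"],
})
authors_toy = pd.DataFrame({
    "project_id": [10, 11, 12],
    "project_author_id": [1, 1, 2],
    "project_author_username": ["ana", "ben", "cam"][:1] * 2 + ["ben"],
})

# Many-to-many on the author: author 1 yields 2 x 2 rows, author 2 yields 1 x 1.
by_author = frame_toy.merge(
    authors_toy.rename(columns={"project_author_id": "author/id"}),
    how="inner",
    on="author/id",
)

# Option A: key on the project id, which is unique in both tables.
by_project = frame_toy.merge(
    authors_toy, how="left", left_on="id", right_on="project_id",
    validate="one_to_one",
)

# Option B: only the username is needed, so build a one-row-per-author lookup first.
lookup = authors_toy[["project_author_id", "project_author_username"]].drop_duplicates()
with_names = frame_toy.merge(
    lookup, how="left", left_on="author/id", right_on="project_author_id",
    validate="many_to_one",
)

print(len(by_author), len(by_project), len(with_names))
```

Option A keeps the per-project block counts alongside the text, while Option B is enough if the goal is only to attach usernames for the tagging analysis.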

In [19]:
merged_df.head()

Unnamed: 0,studio,id,title,description,instructions,author/id,image,history/created,history/modified,history/shared,...,studio_numberscript_count,variable_count,list_count,comment_count,costume_count,sprite_count,block_count,block_unique_count,random_block_count,Unnamed: 12
0,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,1,28,1,0,0,35,6,137,38,4
1,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,2,7,2,0,1,3,2,41,25,2
2,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,3,8,1,0,0,19,3,32,23,2
3,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,4,15,2,2,2,12,2,117,36,0
4,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,5,12,6,2,1,9,3,118,43,4


In [9]:
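# Merge the two columns of text people wrote.
# Fill missing values first so a blank field doesn't turn into the string "nan"
# or make the combined text missing.

merged_df["writing"] = (
    merged_df["description"].fillna("") + "\n" + merged_df["instructions"].fillna("")
).str.strip()

merged_df.head()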


Unnamed: 0,studio,id,title,description,instructions,author/id,image,history/created,history/modified,history/shared,...,studio_numberscript_count,variable_count,list_count,comment_count,costume_count,sprite_count,block_count,block_unique_count,random_block_count,writing
0,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,1,28,1,0,0,35,6,137,38,Animation is hard! And time consuming! But aft...
1,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,2,7,2,0,1,3,2,41,25,Animation is hard! And time consuming! But aft...
2,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,3,8,1,0,0,19,3,32,23,Animation is hard! And time consuming! But aft...
3,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,4,15,2,2,2,12,2,117,36,Animation is hard! And time consuming! But aft...
4,18,237042971,Steven and the Stevens [MV],Animation is hard! And time consuming! But aft...,Press the green flag to watch! (Inspired by St...,2745846,https://cdn2.scratch.mit.edu/get_image/project...,2018-07-22T00:59:37.000Z,2018-07-23T09:27:01.000Z,2018-07-23T09:25:28.000Z,...,5,12,6,2,1,9,3,118,43,Animation is hard! And time consuming! But aft...


## References

Brennan, K. (2013). Best of both worlds: Issues of structure and agency in computational creation, in and out of school. (Doctoral thesis). Massachusetts Institute of Technology, Cambridge, MA.   

Illich, I. (1970). Deschooling Society. 

Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.

Schön, D. A. (1987). Educating the reflective practitioner: Toward a new design for teaching and learning in the professions. 
