# Who Are Our Rebels

In this notebook I use some simple NLP to explore who our favorite rebels were. Along the way I hope to demonstrate some of the data-wrangling challenges that come with NLP.

### Get Data from Canvas

Canvas has a RESTful API. I'm going to use it to pull down the responses to the homework assignments.

By the way, you can also use the Canvas API to access your data.

The cell below contains the code I used to get the data from Canvas.

In [None]:
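import os
import json
import unicodedata

from bs4 import BeautifulSoup
from canvasapi import Canvas

# Read the API token from a local file so it never appears in the notebook.
with open(os.path.join(os.path.expanduser("~"), ".canvaslms", "quiz_token")) as f:
    token = f.read().strip()  # drop any trailing newline from the token file

API_URL = "https://canvas.lms.unimelb.edu.au/"
canvas = Canvas(API_URL, token)
# Look up the full user record for whoever owns the token.
bec = canvas.get_user(canvas.get_current_user().id)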
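
Under the hood, `canvasapi` issues ordinary HTTPS requests against Canvas's REST endpoints. As a rough sketch, assuming the `/api/v1/courses/:course_id/assignments/:assignment_id/submissions` route and bearer-token authentication, the request for one assignment's submissions could be built as shown here. The host name, IDs, token, and the `build_submissions_request` helper are hypothetical placeholders, and nothing is sent over the network.

```python
from urllib.request import Request


def build_submissions_request(base_url, course_id, assignment_id, token):
    """Build (but do not send) a GET request for an assignment's submissions.

    Hypothetical helper for illustration; the notebook itself uses canvasapi.
    """
    url = (
        f"{base_url.rstrip('/')}/api/v1/courses/{course_id}"
        f"/assignments/{assignment_id}/submissions"
    )
    # Canvas expects the access token as a bearer token in the Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder values only: not a real host, course, assignment, or token.
req = build_submissions_request("https://canvas.example.edu/", 1, 2, "not-a-real-token")
print(req.full_url)
print(req.get_header("Authorization"))
```

Building the request separately from sending it makes it easy to inspect the URL and headers before touching a live server.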


### List All the Courses I Can Access

In [None]:
for c in canvas.get_courses():
    print(c)

#### Pick one of them and List the Assignments

In [None]:
ehealth = canvas.get_course(110024)

In [None]:
for a in ehealth.get_assignments():
    print(a.id, a.name)

### Pick One of the Assignments

In [None]:
pex3 = ehealth.get_assignment(151403)

### Get the responses for this assignment
#### If you run this yourself, you should see only one response (your own)

In [None]:
pex3_responses = pex3.get_submissions()

In [None]:
for r in pex3_responses:
    print(r)

In [None]:
dir(r)

### Look at the last response

#### Each response consists of (potentially) multiple entries

In [None]:
len(r.discussion_entries)

In [None]:
for r in pex3_responses:
    print(len(r.discussion_entries), r.seconds_late)

### Each entry is a dictionary

In [None]:
r.discussion_entries[0].keys()

### This is a discussion assignment

- The typed content is in the `message` field

In [None]:
# Flatten every entry of every response into (created_at, updated_at, message) tuples.
tmp = [(e["created_at"], e["updated_at"], e["message"]) for r in pex3_responses for e in r.discussion_entries]

In [None]:
for t in tmp:
    print(t[0], t[1])
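
To make the flattening step concrete without a Canvas connection, here is a minimal sketch on mock data. The `mock_responses` objects and their contents are invented for illustration. The `getattr` fallback is a defensive assumption for submissions that carry no `discussion_entries` attribute; whether that case occurs in the real data is not something this notebook establishes.

```python
from types import SimpleNamespace

# Mock submissions standing in for the objects returned by get_submissions().
mock_responses = [
    SimpleNamespace(discussion_entries=[
        {"created_at": "2020-09-01T01:00:00Z",
         "updated_at": "2020-09-01T01:00:00Z",
         "message": "<p>First post</p>"},
        {"created_at": "2020-09-02T03:30:00Z",
         "updated_at": "2020-09-02T04:00:00Z",
         "message": "<p>A reply</p>"},
    ]),
    SimpleNamespace(),  # a submission with no discussion entries at all
    SimpleNamespace(discussion_entries=[
        {"created_at": "2020-09-03T05:00:00Z",
         "updated_at": "2020-09-03T05:00:00Z",
         "message": "<p>Another post</p>"},
    ]),
]

# One tuple per entry, regardless of which submission it came from.
rows = [
    (e["created_at"], e["updated_at"], e["message"])
    for r in mock_responses
    for e in getattr(r, "discussion_entries", [])
]

for created, updated, message in rows:
    print(created, updated, message)
```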

### Getting Date and Time Information

Parsing dates can be tedious, so I'm using the third-party package `python-dateutil` to make it easier.

In [None]:
from dateutil.parser import parse

In [None]:
for t in tmp:
    # total_seconds() includes whole days; .seconds alone would silently drop them
    print((parse(t[1]) - parse(t[0])).total_seconds())
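
One detail worth knowing when measuring the gap between creation and update: a `timedelta` stores days and seconds separately, so its `.seconds` attribute holds only the sub-day remainder, while `.total_seconds()` gives the full span. The sketch below uses only the standard library and two made-up timestamps; the `YYYY-MM-DDTHH:MM:SSZ` layout is an assumption about how the timestamps look.

```python
from datetime import datetime

# Assumed timestamp layout; the two values are invented for illustration.
FMT = "%Y-%m-%dT%H:%M:%SZ"
created = datetime.strptime("2020-09-01T10:00:00Z", FMT)
updated = datetime.strptime("2020-09-03T10:05:00Z", FMT)

delta = updated - created      # 2 days and 5 minutes
print(delta.days)              # 2
print(delta.seconds)           # 300 (only the 5-minute remainder)
print(delta.total_seconds())   # 173100.0 (2 * 86400 + 300)
```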

### We could look at the readability of our answers

In [None]:
import readability

In [None]:
for t in tmp:
    print(readability.getmeasures(t[2], lang='en')['readability grades']['Kincaid'])
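
For intuition about what the Kincaid number means, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below is not how the `readability` package computes it: the tokenization and the vowel-group syllable counter are crude assumptions, and `count_syllables` and `kincaid_grade` are hypothetical helpers, so the values will differ from the package's output.

```python
import re


def count_syllables(word):
    """Crude heuristic: one syllable per run of consecutive vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def kincaid_grade(text):
    """Approximate Flesch-Kincaid grade level using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = "The cat sat. The dog ran."
dense = "Interoperability limitations complicate longitudinal documentation."

# Short sentences with short words score lower than one long-worded sentence.
print(round(kincaid_grade(simple), 2))
print(round(kincaid_grade(dense), 2))
```

Longer sentences and longer words both push the grade up, which is why terse answers can score below zero.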

In [None]:
with open("pex3_responses.html", "w") as f:
    f.write("""<html><body>\n""")
    f.write("<h1>Compare and Contrast Experiences of Angela, Katie, and Lucy</h1>")
    f.write("""<h2>The Assignment</h2>
<p>As I hope you have learned through this class, healthcare systems and practices vary widely around the world and these differences are largely due to different cultural and political decisions made in each country. This variation is in addition to the biological differences amongst patients. Earlier in this class you met Angela and Rudi and learned a little bit about their experiences with a high risk pregnancy in the USA that ended in pre-term births of twins.</p>
<p>I recently interviewed two neighbors, one of whom recently had a baby here in Australia using the private health system while the other is nearing the end of her pregnancy and has been using the public health system. Their pregnancies have been much more uneventful than Angela's, although Covid-19 has disrupted their normal course of care. In the videos I think you will find that all three pregnancy experiences are quite different, with a substantial difference between the private and public Australian experience.&nbsp;</p>
<p>To help me better understand these differences, I reached out to a colleague at the University of Utah, <a href="https://healthcare.utah.edu/fad/mddetail.php?physicianID=u0976071&amp;name=lori-m-gawron" target="_blank" rel="noopener">Dr. Lori Gawron</a>, an obstetrician, and asked about standard obstetrics practice. My questions are in bold below</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" src="https://securembm.uuhsc.utah.edu/zeus/public/mbm-media/faculty-profile?facultyPK=u0976071" /></p>
<div>
<div class="">
<ul class="MailOutline">
<li class=""><strong>What maternal and fetal data are routinely collected during prenatal care? Are there data that are routinely observed during visits but not necessarily recorded? </strong><span>While there are recommended aspects of prenatal care visits - the collection is variable based on the electronic health record or paper charting used. How they are collected also varies- even in EPIC [electronic health record] which has a number of prenatal flow sheet packages, not everyone uses them and often freetext notes. We routinely take vitals, weight, fetal heart rate, and fundal height, as the most reliably collected data and most evidence based. Everything else will be provider dependent as to whether or not they do it or document it, as data are less.&nbsp;</span></li>
<li class=""><strong>What maternal and infant data are routinely collected after birth? </strong><span>&nbsp;Both [inpatient and postpartum visits] may or may not use templates and inpatient is extremely variable based on delivery type and any complications. Postpartum- most people document vitals, weight, exam, postpartum depression results from screening tools, breastfeeding issues and contraception- again this may all be freetext.</span></li>
<li class=""><strong>Do you have any idea about how much variance there is in practice across developed countries?</strong> <span>The variance is completely&nbsp;related to the health system and payer mix but it's totally different country to country- England does a ton of home deliveries and with midwives, for example.</span></li>
</ul>

<p><span>After viewing Katie and Lucy's interviews, and refreshing your memory regarding Angela's video if needed, identify </span></p>
<ol>
<li><span>An information-related theme that seems to be common to all three experiences where an informatics solution might result in an improved patient experience.</span></li>
<li><span>Notable or surprising differences between the three experiences. These need not be information-centric.</span></li>
</ol>
<p><span>Provide a short, written description of your observations in the discussion thread.&nbsp;</span></p>
</div>
</div>  
<h2>The Responses</h2>
    """)
    for t in tmp:
        f.write(t[2] + "\n")  # t[2] is the message HTML; t[1] is only the updated_at timestamp
        f.write("<hr>\n")
    f.write("</body></html>")

In [None]:
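# Strip the HTML tags, then fold compatibility characters (e.g. non-breaking spaces) with NFKC.
pex3_text = [unicodedata.normalize("NFKC", BeautifulSoup(e[2], "html.parser").get_text()) for e in tmp]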
pex3_text
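
The cleaning step does two things: it removes markup, and it folds characters such as the non-breaking space (`&nbsp;`, U+00A0) into their plain equivalents. The sketch below shows the same idea using only the standard library; `TextExtractor` and `html_to_text` are hypothetical stand-ins for BeautifulSoup, not a replacement for it on messy real-world HTML.

```python
import unicodedata
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True by default, so &nbsp; becomes U+00A0
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    raw = "".join(parser.parts)
    # NFKC maps compatibility characters such as U+00A0 to plain equivalents.
    return unicodedata.normalize("NFKC", raw)


sample = "<p>Care&nbsp;varies <strong>widely</strong>.</p>"
print(repr(html_to_text(sample)))
```

Without the normalization step the non-breaking space would survive tag stripping and could later split or merge tokens in unexpected ways.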

- [Checking grammar with BERT and ULMFiT](https://towardsdatascience.com/checking-grammar-with-bert-and-ulmfit-1f59c718fe75)