# Pre-processing Pipeline

In [1]:
#!/Users/mattmann/git/buildout.python/python-3.7/bin/pip3.7 install tqdm spacy scispacy spacy_langdetect https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.3/en_core_sci_lg-0.2.3.tar.gz

In [2]:
import spacy 
import scispacy
import pandas as pd 
import os
import numpy as np
import json
from tqdm.notebook import tqdm
from scipy.spatial import distance
import ipywidgets as widgets
from scispacy.abbreviation import AbbreviationDetector
from spacy_langdetect import LanguageDetector

# UMLS linking will find concepts in the text, and link them to UMLS. 
from scispacy.umls_linking import UmlsEntityLinker
import time

# Time for NLP!

Let's load our language model. Because we're dealing with biomedical text, we want a model that's been pretrained on biomedical literature: its vocabulary and word statistics are very different from, say, news or Wikipedia articles. Luckily, there are already pre-trained models for spaCy (from scispaCy), so let's load the largest one we can!

In [3]:
nlp = spacy.load("en_core_sci_lg", disable=["tagger"])
#nlp = spacy.load("en_core_web_sm")
# If you're on kaggle, load the model with the following, if you run into an error:
#nlp = spacy.load("/Users/mattmann/git/buildout.python/python-3.7/lib/python3.7/site-packages/en_core_sci_lg/en_core_sci_lg-0.2.3/", disable=["tagger"])

# We also need to detect the language, or else we'll be parsing non-English text
# as if it were English.
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)

# Add the abbreviation pipe to the spacy pipeline. Only need to run this once.
abbreviation_pipe = AbbreviationDetector(nlp)
nlp.add_pipe(abbreviation_pipe)

# Our linker will look up named entities/concepts in the UMLS graph and normalize
# the data for us. 
linker = UmlsEntityLinker(resolve_abbreviations=True)
nlp.add_pipe(linker)



### Adding a vector for COVID-19

One last thing. COVID-19 is a new word, and doesn't exist in the vocabulary for our spaCy model. We'll need to add it manually; let's try setting it to equal the average vector of words that should represent what COVID-19 refers to, and see if that works. I'm not an expert so I just took definitions from Wikipedia and the etiology section of https://onlinelibrary.wiley.com/doi/full/10.1002/jmv.25740. There's a much better way of doing this (fine-tuning the model on our corpus) but I have no idea how to do this in spaCy...

In [4]:
from spacy.vocab import Vocab
new_vector = nlp(
               """Single‐stranded RNA virus, belongs to subgenus 
                   Sarbecovirus of the genus Betacoronavirus.5 Particles 
                   contain spike and envelope, virions are spherical, oval, or pleomorphic 
                   with diameters of approximately 60 to 140 nm.
                   Also known as severe acute respiratory syndrome coronavirus 2, 
                   previously known by the provisional name 2019 novel coronavirus 
                   (2019-nCoV), is a positive-sense single-stranded RNA virus. It is 
                   contagious in humans and is the cause of the ongoing pandemic of 
                   coronavirus disease 2019 that has been designated a 
                   Public Health Emergency of International Concern""").vector

vector_data = {"COVID-19": new_vector,
               "2019-nCoV": new_vector,
               "SARS-CoV-2": new_vector}

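# Register the same vector for every alias directly on the model's vocab
# (a separate, empty Vocab() object is not needed for this).
for word, vector in vector_data.items():
    nlp.vocab.set_vector(word, vector)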
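
As far as I understand it, `Doc.vector` in spaCy defaults to the average of the token vectors, which is why running a description through `nlp(...)` gives us a single "definition" vector. Here is a minimal sketch of that averaging idea with no spaCy involved; the 3-dimensional embeddings and the helper name `mean_vector` are made up for illustration.

```python
from typing import Dict, List


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical toy embeddings; the real model's vectors are much larger.
toy_embeddings: Dict[str, List[float]] = {
    "rna": [1.0, 0.0, 2.0],
    "virus": [3.0, 2.0, 0.0],
    "respiratory": [2.0, 4.0, 1.0],
}

description = ["rna", "virus", "respiratory"]
covid_vector = mean_vector([toy_embeddings[word] for word in description])

# Every alias shares the same vector, mirroring vector_data in the cell above.
alias_vectors = {alias: covid_vector
                 for alias in ("COVID-19", "2019-nCoV", "SARS-CoV-2")}
print(covid_vector)
```

The obvious limitation is that the three aliases end up identical, so the model can't tell the disease (COVID-19) from the virus (SARS-CoV-2).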

### Sanity Check
Alright, let's check if this works.

In [5]:
print(
    nlp("COVID-19").similarity(nlp("novel coronavirus")), "\n",
    nlp("SARS-CoV-2").similarity(nlp("severe acute respiratory syndrome")), "\n",
    nlp("COVID-19").similarity(nlp("sickness caused by a new virus")))

0.5324297344523997 
 0.34796126970622626 
 0.7016356811120861
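
For context on those numbers: spaCy's default `similarity` is, to my knowledge, the cosine similarity between the two vectors, so 1.0 means "same direction" and 0.0 means "unrelated". A small pure-Python sketch of that measure (the helper name `cosine_similarity` and the toy vectors are mine, not part of the notebook):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
in_between = cosine_similarity([3.0, 4.0], [4.0, 3.0])
print(same_direction, orthogonal, in_between)
```

The zero-vector guard matters later on, because non-English rows get an all-zero vector in the pipeline.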


I guess we'll find out if that's good enough for our purposes! Let's save it so other people can use it!

In [6]:
nlp.to_disk('coronawhy/covid-19-en_lg')

Some of the texts are particularly long, so we need to increase the `max_length` attribute of `nlp` to more than 1.25 million characters. The alternative would be cutting the length of the article or dropping it entirely (I believe there's some sort of anomaly with this particular article), but we'll keep it for now.

In [7]:
nlp.max_length=2000000
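
If raising `max_length` ever becomes a memory problem, the "cut the article" alternative could look roughly like this: pack paragraphs into chunks below a character budget and process each chunk separately. This is only a sketch under my own assumptions (paragraphs separated by blank lines, a hypothetical helper `split_into_chunks`); it is not what the notebook actually does.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk,
    so no text is silently dropped.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


article = "aaaa\n\nbbbb\n\ncccccccccccc\n\ndd"
chunks = split_into_chunks(article, max_chars=10)
print(chunks)
```

One catch: character offsets computed per chunk would need the chunk's starting position added back before they line up with the original article.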

Next, we want to replace all abbreviations with their long forms. This is important for semantic indexing because the model has probably seen words like "Multiple sclerosis" but may have seen the abbreviation "MS" in different contexts. That means their vector representations are different, and we don't want that! 

The abbreviation detector we added to the scispaCy pipeline above handles this. Let's check that it works on a toy sentence.

In [8]:
doc = nlp("Attention deficit disorder (ADD) is treated using various medications. However, ADD is not...")

print("Abbreviation", "\t", "Definition")
for abrv in doc._.abbreviations[0:10]:
    print(f"{abrv} \t ({abrv.start_char}, {abrv.end_char}) {abrv._.long_form}")

Abbreviation 	 Definition
ADD 	 (80, 83) Attention deficit disorder
ADD 	 (28, 31) Attention deficit disorder


Notice we get some weird results towards the end if you print **all** of them (lots of a's being converted to at's), but we can ignore that for now. If we need to remove stop words later, we can.
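
To make the expansion step concrete, here is a spaCy-free sketch of the underlying idea: walk the abbreviation spans in order and splice the long form in by character offset. The regex is just a stand-in for `doc._.abbreviations`, and `expand_spans` is a hypothetical helper name.

```python
import re
from typing import List, Tuple


def expand_spans(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end) character span with its long form."""
    pieces = []
    cursor = 0
    for start, end, long_form in sorted(spans):
        pieces.append(text[cursor:start])
        pieces.append(long_form)
        cursor = end
    pieces.append(text[cursor:])  # keep the text after the last span
    return "".join(pieces)


text = ("Attention deficit disorder (ADD) is treated using various medications. "
        "However, ADD is not...")
# Stand-in for the abbreviation detector: find each "ADD" with a regex.
spans = [(m.start(), m.end(), "Attention deficit disorder")
         for m in re.finditer(r"\bADD\b", text)]
expanded = expand_spans(text, spans)
print(expanded)
```

Sorting the spans first matters because the detector doesn't necessarily return them in document order, as the output above shows.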

### Making the Vector DataFrames
Appending to a DataFrame row by row gets slower as it grows, because `df.append` copies the entire object on every call. Instead, the following takes an article's text, breaks it into sentences, and vectorizes each sentence (using scispaCy's pre-trained word2vec vectors), collecting the results in plain Python lists. Finally, the lists are loaded into a DataFrame in one go and saved.

So here's the real meat of our pre-processing. This is really heavy because it processes the text sentence by sentence and generates a lot of metadata (entities, vectors) for each one. We can break it into pieces later depending on the task we want to use this information for, but querying lines is a lot more useful than querying whole documents when you want to know about something specific like seroconversion, spike proteins, or something else. Once you identify lines of interest, you can generate more data about the actual document, since each line will be indexed with document, start and end character, entities, vectors, and language.
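
As a rough picture of what "one row per line" means, here is a sketch that builds sentence-level records with character offsets, accumulating them in a plain list rather than appending to a DataFrame. The regex splitter is a naive stand-in for spaCy's `doc.sents`, and `sentence_records` is a hypothetical helper; the field names follow the columns used by the pipeline.

```python
import re
from typing import Dict, List


def sentence_records(doc_id: str, text: str, section: str) -> List[Dict[str, object]]:
    """Split text into sentences and index each one by character offsets."""
    records: List[Dict[str, object]] = []
    # Naive splitter: a run of non-terminators followed by optional ., ! or ?
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        sentence = raw.strip()
        if not sentence:
            continue
        # Skip leading whitespace so the offsets point at the sentence itself.
        start = match.start() + (len(raw) - len(raw.lstrip()))
        records.append({"_id": doc_id, "section": section, "sentence": sentence,
                        "startChar": start, "endChar": start + len(sentence)})
    return records


text = "Seroconversion was observed. Spike proteins bind ACE2."
rows = sentence_records("doc-1", text, "text")

# Querying lines instead of whole documents:
hits = [r for r in rows if "spike" in str(r["sentence"]).lower()]
print(hits)
```

Because each record keeps `_id`, `startChar`, and `endChar`, a hit can always be traced back to its exact position in the source document.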

#### Lemmatized Text

Just in case we need it, let's do some text cleaning and include that in a different column. Lemmatization normalizes the data so that when you're creating word clouds or a simplified TF-IDF, the number of dimensions you're dealing with is significantly reduced. It's also nice to remove words that don't contribute much meaning, but do note that removing stop words can make neural models less accurate, depending on the task you're using them for.
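
A tiny illustration of why lemmatization shrinks the number of dimensions: inflected forms collapse onto one lemma, so the vocabulary (and hence a bag-of-words or TF-IDF matrix) gets narrower. The lemma table below is invented for the example; in the notebook, spaCy's `token.lemma_` plays that role.

```python
from typing import Dict, List, Set

# Hypothetical lemma lookup, standing in for spaCy's token.lemma_.
TOY_LEMMAS: Dict[str, str] = {
    "viruses": "virus",
    "infected": "infect",
    "infects": "infect",
    "infecting": "infect",
    "cells": "cell",
}


def lemmatize(tokens: List[str]) -> List[str]:
    """Lower-case each token and map it to its lemma when one is known."""
    return [TOY_LEMMAS.get(t.lower(), t.lower()) for t in tokens]


tokens = "Viruses infect cells and the virus infected a cell while infecting cells".split()
raw_vocab: Set[str] = {t.lower() for t in tokens}
lemma_vocab: Set[str] = set(lemmatize(tokens))
print(len(raw_vocab), "distinct tokens ->", len(lemma_vocab), "distinct lemmas")
```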


In [9]:
def df_cleaner(df):
    df.fillna("Empty", inplace=True) # If we leave floats (NaN), spaCy will break.
    for col in df.columns:
        # Some articles are filled with " q q q q q q q q q"; remove those runs.
        # Only touch string cells so numeric columns don't raise a TypeError.
        df[col] = df[col].apply(lambda x: x.replace(" q q", "") if isinstance(x, str) else x)
    return df

# Convenience method for lemmatizing a text column. Expects df to already have a
# "language" column. Punctuation tokens are kept as separate items in the output.
def lemmatize_my_text(df, column):
    lemma_column = []
    for i in range(len(df)):
        if df.iloc[i]["language"] == "en":
            doc = nlp(str(df.iloc[i][column]), disable=["ner","linker", "language_detector"])
            lemmatized_doc = " ".join([token.lemma_ for token in doc])
            lemma_column.append(lemmatized_doc)
        else:
            lemma_column.append("Non-English")
    return lemma_column

# Unabbreviate text. This should be done BEFORE lemmatization and vectorization.
def unnabreviate_my_text(doc):
    if len(doc._.abbreviations) > 0 and doc._.language["language"] == "en":
        # Walk the abbreviations in document order.
        abbreviations = sorted(doc._.abbreviations, key=lambda abbrev: abbrev.start_char)
        join_list = []
        start = 0
        for abbrev in abbreviations:
            join_list.append(str(doc.text[start:abbrev.start_char]))
            if len(abbrev._.long_form) > 5: # Increase length so "a" and "an" don't get un-abbreviated
                join_list.append(str(abbrev._.long_form))
            else:
                join_list.append(str(doc.text[abbrev.start_char:abbrev.end_char]))
            start = abbrev.end_char
        # Keep whatever follows the last abbreviation.
        join_list.append(str(doc.text[start:]))
        new_text = "".join(join_list)
        # We have new text. Re-nlp the doc for further processing!
        doc = nlp(new_text)
    # Docs without abbreviations (or non-English docs) are returned unchanged.
    return doc
    
def pipeline(df, column, dataType, filename):
    
    languages = []
    start_chars = []
    end_chars = []
    entities = []
    sentences = []
    vectors = []
    _ids = []
    columns = []
    lemmas = []
        
    for i in tqdm(df.index):
        doc = nlp(str(df.iloc[i][column]))
                  
        #doc = unnabreviate_my_text(doc)

        if doc._.language["language"] == "en" and len(doc.text) > 5:
            for sent in tqdm(doc.sents):
                languages.append(doc._.language["language"])
                sentences.append(sent.text)
                vectors.append(sent.vector)
                start_chars.append(sent.start_char)
                end_chars.append(sent.end_char)
                doc_ents = []
                for ent in sent.ents: 
                    if len(ent._.umls_ents) > 0:
                        poss = linker.umls.cui_to_entity[ent._.umls_ents[0][0]].canonical_name
                        doc_ents.append(poss)
                entities.append(doc_ents)
                _ids.append(df.iloc[i,0])
                if dataType == "tables":
                    columns.append(df.iloc[i]["figure"])
                elif dataType == "text":
                    columns.append(column)
                lemmas.append([token.lemma_ for token in sent])
        else: 
            start_chars.append(0)
            end_chars.append(len(doc.text))
            entities.append("Non-English")
            sentences.append(doc.text)
            vectors.append(np.zeros(200))
            _ids.append(df.iloc[i,0])
            languages.append(doc._.language["language"])
            if dataType == "tables":
                columns.append(df.iloc[i]["figure"])
            elif dataType == "text":
                columns.append(column)
            lemmas.append("Non-English")
            
    df = pd.DataFrame(data={"_id": _ids, "language": languages, "section": columns, "sentence": sentences, 
            "startChar": start_chars, "endChar": end_chars, "entities": entities, "lemma": lemmas, "w2vVector":vectors})
    df.to_csv(filename, index=False)

In [11]:
df = pd.read_csv("coronawhy/covid_full_unabbreviated-method date of datetime.datetime objects.csv")
df.drop(columns=["citations","title","abstract"], inplace=True)
df_list = np.array_split(df, 1000)
os.makedirs("coronawhy/df_parts", exist_ok=True)  # don't fail if the folder already exists
for i in range(len(df_list)):
    df_list[i].to_csv(f"./coronawhy/df_parts/{i}.csv", index=False)

In [14]:
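for i in tqdm(os.listdir("coronawhy/df_parts")):
    if i.endswith("_processed.csv"):
        continue  # skip outputs written by an earlier (interrupted) run
    f = "coronawhy/df_parts/" + i
    df = pd.read_csv(f)
    # str.strip(".csv") strips characters, not a suffix, so use splitext instead.
    pipeline(df=df, column="text", dataType="text", filename=os.path.splitext(f)[0]+"_processed.csv")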
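
A note on building the output file names: `str.strip(".csv")` removes any of the characters `.`, `c`, `s`, `v` from *both* ends of the string rather than removing the suffix, so a path starting with `coronawhy/` loses its leading `c`. A small sketch of the difference, using `os.path.splitext` (the helper name `processed_name` is mine):

```python
import os


def processed_name(path: str) -> str:
    """Insert "_processed" before the file extension."""
    root, ext = os.path.splitext(path)
    return f"{root}_processed{ext}"


path = "coronawhy/df_parts/0.csv"
stripped = path.strip(".csv")    # character-set strip, not suffix removal
safe = processed_name(path)
print(stripped)
print(safe)
```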

KeyboardInterrupt: 