# UFABC PDF to Spreadsheet

Define the string variables ```url``` and ```file_name```: the URL of the PDF file to be converted and the name to give the downloaded and generated files, respectively.

In [1]:
url = "http://prograd.ufabc.edu.br/pdf/turmas_salas_docentes_sa_2018.1.pdf"
file_name = "2018.1_SA" # without extension

Uncomment the relevant lines to delete the specified files (*don't worry: where possible, the files are downloaded or regenerated by the following cells*).

In [2]:
# PDF, CSV, and JSON files
# !rm $file_name".pdf"
# !rm $file_name".csv"
# !rm $file_name".json"

# tabula-java library
# !rm tabula.jar
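
The `!rm` lines above depend on a Unix-like shell. As a rough, portable alternative, the same cleanup could be sketched in plain Python; `remove_if_exists` is a hypothetical helper name, not part of the original notebook.

```python
from pathlib import Path

def remove_if_exists(paths):
    """Delete each existing file in `paths`; return the names actually removed."""
    removed = []
    for path in map(Path, paths):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed

# Hypothetical usage with the notebook's `file_name` variable:
# remove_if_exists(file_name + ext for ext in (".pdf", ".csv", ".json"))
# remove_if_exists(["tabula.jar"])
```

Files that do not exist are skipped silently, so the helper can be run repeatedly without errors.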

The PDF file is downloaded, saved under the new name, and converted to a CSV file with tabula-java (only page 3 is converted, as set by ```--pages 3```).

In [3]:
from pathlib import Path
from IPython.display import FileLink
from IPython.display import IFrame

if not Path(file_name + ".pdf").is_file():
    !echo "Downloading the PDF file..."
    !wget $url -O $file_name".pdf"
    file_pdf = FileLink(file_name + '.pdf')
    !echo PDF file saved as:
else:
    file_pdf = FileLink(file_name + '.pdf')
    !echo PDF file already exists:
display(file_pdf)

if not Path("tabula.jar").is_file():
    !echo "Downloading the tabula-java PDF converting library..."
    !wget https://github.com/tabulapdf/tabula-java/releases/download/v1.0.1/tabula-1.0.1-jar-with-dependencies.jar -O tabula.jar
    file_tabula = FileLink('tabula.jar')
    !echo Library downloaded as:
else:
    file_tabula = FileLink('tabula.jar')
    !echo tabula-java library file already exists:
display(file_tabula)

if not Path(file_name + ".csv").is_file():
    !echo "Converting the PDF file (this might take a while)..."
    !java -Dfile.encoding=utf-8 -jar tabula.jar -l --pages 3 $file_name".pdf" -o $file_name".csv"
    !echo Done!
    file_csv = FileLink(file_name + '.csv')
    !echo CSV file saved as:
else:
    file_csv = FileLink(file_name + '.csv')
    !echo CSV file already exists:
display(file_csv)

Downloading the PDF file...
--2018-02-17 15:43:48--  http://prograd.ufabc.edu.br/pdf/turmas_salas_docentes_sa_2018.1.pdf
Resolving prograd.ufabc.edu.br (prograd.ufabc.edu.br)... 200.133.215.63
Connecting to prograd.ufabc.edu.br (prograd.ufabc.edu.br)|200.133.215.63|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 689670 (674K) [application/pdf]
Saving to: ‘2018.1_SA.pdf’


2018-02-17 15:43:52 (265 KB/s) - ‘2018.1_SA.pdf’ saved [689670/689670]

PDF file saved as:


Downloading the tabula-java PDF converting library...
--2018-02-17 15:43:54--  https://github.com/tabulapdf/tabula-java/releases/download/v1.0.1/tabula-1.0.1-jar-with-dependencies.jar
Resolving github.com (github.com)... 192.30.255.113, 192.30.255.112
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/20046106/80e33368-7aba-11e7-874c-17aa8674f120?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20180217%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20180217T154355Z&X-Amz-Expires=300&X-Amz-Signature=9a0f1d8ef310881a49ee9ef8f1ddb1a71b7c066460e286947230bc5955f32a5b&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Dtabula-1.0.1-jar-with-dependencies.jar&response-content-type=application%2Foctet-stream [following]
--2018-02-17 15:43:55--  https://github-production-release-asset-2e65be.s

CSV file already exists:
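
The cell above follows a "download or convert only if the file is missing" pattern using shell commands. A minimal pure-Python sketch of the same idea is shown below, assuming `urllib` is an acceptable substitute for `wget`; the helper names `download_if_missing` and `tabula_command` are hypothetical, while the tabula-java flags are the ones already used above.

```python
from pathlib import Path
from urllib.request import urlretrieve

def download_if_missing(url, destination):
    """Fetch `url` into `destination` unless the file already exists.

    Returns True when a download happened, False when the file was reused.
    """
    destination = Path(destination)
    if destination.is_file():
        return False
    urlretrieve(url, str(destination))
    return True

def tabula_command(pdf_path, csv_path, pages="3", jar="tabula.jar"):
    """Build the same tabula-java command line used in the cell above."""
    return ["java", "-Dfile.encoding=utf-8", "-jar", jar,
            "-l", "--pages", pages, str(pdf_path), "-o", str(csv_path)]

# Hypothetical usage (requires network access and a Java runtime):
# import subprocess
# download_if_missing(url, file_name + ".pdf")
# subprocess.run(tabula_command(file_name + ".pdf", file_name + ".csv"), check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues when the file name contains spaces.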


PDF preview

In [4]:
IFrame(file_name + '.pdf', width=600, height=300)

Install the Natural Language Toolkit (NLTK), which provides the Portuguese treebank (Floresta) used below.

In [5]:
!pip install nltk

Collecting nltk
  Downloading nltk-3.2.5.tar.gz (1.2MB)
    100% |████████████████████████████████| 1.2MB 969kB/s eta 0:00:01
Building wheels for collected packages: nltk
  Running setup.py bdist_wheel for nltk ... done
  Stored in directory: /home/jovyan/.cache/pip/wheels/18/9c/1f/276bc3f421614062468cb1c9d695e6086d0c73d67ea363c501
Successfully built nltk
Installing collected packages: nltk
Successfully installed nltk-3.2.5


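Download the required NLTK data, load the tagged words of the Floresta treebank, and simplify each tag by keeping only the part after the ```+```.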

In [6]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('floresta')
from nltk import tokenize
from nltk.corpus import floresta
def simplify_tag(t):
    if "+" in t:
        return t[t.index("+")+1:]
    else:
        return t
twords = nltk.corpus.floresta.tagged_words()
twords = [(w.lower(),simplify_tag(t)) for (w,t) in twords]

# Insert some missing prepositions
twords.insert(0,('da','prp'))
twords.insert(0,('de','prp'))
twords.insert(0,('di','prp'))
twords.insert(0,('do','prp'))
twords.insert(0,('du','prp'))

[nltk_data] Downloading package punkt to /home/jovyan/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/jovyan/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package floresta to /home/jovyan/nltk_data...
[nltk_data]   Unzipping corpora/floresta.zip.
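
To illustrate what the tag simplification does, here is a small standalone sketch that needs no NLTK download. The sample tags are illustrative assumptions about the Floresta tag format (a syntactic function, a `+`, then the part of speech), not values taken from this notebook's output.

```python
def simplify_tag(tag):
    """Keep only the part after the first '+', as in the cell above."""
    return tag.split("+", 1)[1] if "+" in tag else tag

# Illustrative tags in the assumed "function+part-of-speech" format.
samples = ["H+n", ">N+art", "H+prop", "prp"]
print({tag: simplify_tag(tag) for tag in samples})
# {'H+n': 'n', '>N+art': 'art', 'H+prop': 'prop', 'prp': 'prp'}
```

Tags without a `+` are returned unchanged, which is why the manually inserted `'prp'` entries work without any prefix.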


With NLTK properly prepared, create the ```title_pos_tag``` function, which imitates the built-in ```str.title``` method but doesn't capitalize conjunctions and prepositions.

In [7]:
def title_pos_tag(text):
    def pos_tag_portuguese(tokens):
        for index in range(len(tokens)):
            for word in twords:
                token = tokens[index].lower()
                if word[0] == token:
                    tag = word[1]
                    tokens[index] = (token, tag)
                    break
        return tokens
    tokens = tokenize.word_tokenize(text, language='portuguese')
    tagged = pos_tag_portuguese(tokens)
    new_text = ''
    for index in range(len(tagged)):
        token = tagged[index]
        if isinstance(token, tuple):
            word = token[0]
            tag  = token[1]
            # n:         noun
            # prop:      proper noun
            # art:       article
            # pron:      pronoun
            # pron-pers: personal pronoun
            # pron-det:  determiner pronoun
            # pron-indp: noun/independent pronoun
            # adj:       adjective
            # n-adj:     noun/adjective
            # v:         verb
            # v-fin:     finite verb
            # v-inf:     infinitive verb
            # v-pcp:     participle verb
            # v-ger:     gerund verb
            # num:       numeral
            # prp:       preposition
            # conj:      conjunction
            # conj-s:    subordinating conjunction
            # conj-c:    coordinating conjunction
            # intj:      interjection
            # adv:       adverb
            # xxx:       other
            if 'conj' in tag or \
               'prp'  in tag:
                new_text = new_text + ' ' + word.lower()
            else:
                new_text = new_text + ' ' + word.capitalize()
        else:
            new_text = new_text + ' ' + token.capitalize()
    new_text = new_text.strip()
#     return (new_text, tagged)
    return new_text
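
The function above scans the whole tagged-word list for every token, which can be slow for a large corpus. One possible variation, sketched here under simplifying assumptions, builds a dictionary once and looks each token up in constant time. The names `build_lookup` and `title_with_lookup` are hypothetical, the sample word list is made up for illustration, and plain `str.split` replaces NLTK's tokenizer.

```python
def build_lookup(tagged_words):
    """Map each word to the tag of its first occurrence (same priority as a linear scan)."""
    lookup = {}
    for word, tag in tagged_words:
        lookup.setdefault(word, tag)
    return lookup

def title_with_lookup(text, lookup):
    """Capitalize every word except those tagged as conjunction or preposition."""
    words = []
    for token in text.split():
        tag = lookup.get(token.lower(), "")
        if "conj" in tag or "prp" in tag:
            words.append(token.lower())
        else:
            words.append(token.capitalize())
    return " ".join(words)

# Made-up sample standing in for the simplified Floresta word list.
sample = [("de", "prp"), ("e", "conj-c"), ("bases", "n"), ("de", "n")]
lookup = build_lookup(sample)
print(title_with_lookup("BASES MATEMÁTICAS E FUNÇÕES DE UMA VARIÁVEL", lookup))
# Bases Matemáticas e Funções de Uma Variável
```

Because `setdefault` keeps the first tag seen for each word, entries inserted at the front of the list still take priority, as they do in the linear scan.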

Create a class that renders JSON data as an expandable tree in the notebook

In [8]:
import uuid
from IPython.display import display_javascript, display_html, display
import json

class RenderJSON(object):
    def __init__(self, json_data):
        if isinstance(json_data, (dict, list)):
            self.json_str = json.dumps(json_data)
        else:
            self.json_str = json_data
        self.uuid = str(uuid.uuid4())

    def _ipython_display_(self):
        display_html('<div id="{}" style="height: 600px; width:100%;"></div>'.format(self.uuid), raw=True)
        display_javascript("""
        require(["https://rawgit.com/caldwell/renderjson/master/renderjson.js"], function() {
        document.getElementById('%s').appendChild(renderjson(%s))
        });
        """ % (self.uuid, self.json_str), raw=True)
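
The renderer above relies on a JavaScript file loaded from an external host. In case that script cannot be loaded, a plain-text preview is a simple fallback; the sketch below uses only the standard library, and `json_preview` is a hypothetical helper name.

```python
import json

def json_preview(data, max_chars=500):
    """Return an indented JSON string, truncated to roughly `max_chars` characters."""
    text = json.dumps(data, indent=2, ensure_ascii=False)
    return text if len(text) <= max_chars else text[:max_chars] + "\n..."

print(json_preview([{"codigo": "DAESZM035-17SA", "teoria": []}]))
```

`ensure_ascii=False` keeps accented Portuguese characters readable instead of escaping them.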

The CSV file is then processed into a JSON file

In [9]:
import csv
with open(file_name + '.csv', encoding="utf-8") as csv_file:
    full_data = []
    content = csv.reader(csv_file, delimiter=',', quotechar='"')
    week_names = ('segunda','terça','quarta','quinta','sexta','sábado','domingo')
    index = -1
    for row in content:
        index = index + 1
        if index:
#             print(', '.join(row).replace('\r',''))
#             print()
            column = 0
            for cell in row:
                column = column + 1
                data = cell.replace('\r','').replace('\n',' ').replace(' , ',', ').strip()
                if   data == '¬': data = ''
                elif data == '0': data = ''

                # Code
                if column == 1:
                    codigo = data.upper()

                # Course name and class
                elif column == 2:
                    # Campus
                    data, _, campus = data.rpartition('(')
                    campus = title_pos_tag(campus[:-1])

                    # Course name
                    disciplina, _, data = data.strip().rpartition(' ')
                    disciplina = title_pos_tag(disciplina)

                    # Class and period
                    turma, _, periodo = data.strip().rpartition('-')
                    turma   = turma.upper()
                    periodo = periodo.capitalize()

                    # Subcode
                    subcodigo, _, _ = codigo.partition('-')
                    subcodigo = subcodigo[len(turma)+1:]


                # Theory classes
                elif column == 3:
                    for week in week_names:
                        data = data.replace(week, '\n' + week)
                    teoria = data.replace(', \n','\n').strip().splitlines()
                    
                    teoria_num_of_days = len(teoria)
                    teoria_dia_da_semana = [None]*teoria_num_of_days
                    teoria_entrada       = [None]*teoria_num_of_days
                    teoria_saida         = [None]*teoria_num_of_days
                    teoria_sala          = [None]*teoria_num_of_days
                    teoria_frequencia    = [None]*teoria_num_of_days
                    for day in range(teoria_num_of_days):
                        data = teoria[day]
                        teoria_dia_da_semana[day], _, data                   = data.partition(' das ')
                        teoria_entrada[day],       _, data                   = data.partition(' às ')
                        teoria_saida[day],         _, data                   = data.partition(', sala ')
                        teoria_sala[day],          _, teoria_frequencia[day] = data.partition(', ')
                        
                        teoria_dia_da_semana[day] = teoria_dia_da_semana[day].capitalize()
                        teoria_frequencia[day]    = teoria_frequencia[day].capitalize()
                        teoria_sala[day]          = teoria_sala[day].upper()

                # Practice classes
                elif column == 4:
                    for week in week_names:
                        data = data.replace(week, '\n' + week)
                    pratica = data.replace(', \n','\n').strip().splitlines()
                    
                    pratica_num_of_days = len(pratica)
                    pratica_dia_da_semana = [None]*pratica_num_of_days
                    pratica_entrada       = [None]*pratica_num_of_days
                    pratica_saida         = [None]*pratica_num_of_days
                    pratica_sala          = [None]*pratica_num_of_days
                    pratica_frequencia    = [None]*pratica_num_of_days
                    for day in range(pratica_num_of_days):
                        data = pratica[day]
                        pratica_dia_da_semana[day], _, data                   = data.partition(' das ')
                        pratica_entrada[day],       _, data                   = data.partition(' às ')
                        pratica_saida[day],         _, data                   = data.partition(', sala ')
                        pratica_sala[day],          _, pratica_frequencia[day] = data.partition(', ')
                        
                        pratica_dia_da_semana[day] = pratica_dia_da_semana[day].capitalize()
                        pratica_frequencia[day]    = pratica_frequencia[day].capitalize()
                        pratica_sala[day]          = pratica_sala[day].upper()

                # Theory instructor
                elif column == 5:
                    docente_teoria = title_pos_tag(data)

                # Practice instructor
                elif column == 6:
                    docente_pratica = title_pos_tag(data)

            teoria = []
            i = -1  # 0-based ids, consistent with the practice entries and the row id
            for day in range(teoria_num_of_days):
                i = i + 1
                teoria_new = {'id': i,
                              'dia_da_semana': teoria_dia_da_semana[day],
                              'horario_de_entrada': teoria_entrada[day],
                              'horario_de_saida': teoria_saida[day],
                              'sala': teoria_sala[day],
                              'frequencia': teoria_frequencia[day]}
                teoria.append(teoria_new)
                
            pratica = []
            i = -1
            for day in range(pratica_num_of_days):
                i = i + 1
                pratica_new = {'id': i,
                               'dia_da_semana': pratica_dia_da_semana[day],
                               'horario_de_entrada': pratica_entrada[day],
                               'horario_de_saida': pratica_saida[day],
                               'sala': pratica_sala[day],
                               'frequencia': pratica_frequencia[day]}
                pratica.append(pratica_new)
                
            new_data = {'id': index-1,
                        'codigo': codigo,
                        'subcodigo': subcodigo,
                        'disciplina': disciplina,
                        'campus': campus,
                        'periodo': periodo,
                        'turma': turma,
                        'teoria': teoria,
                        'pratica': pratica,
                        'docente_teoria': docente_teoria,
                        'docente_pratica': docente_pratica}
            full_data.append(new_data)
            
#             print("Código:\t\t\t", codigo)
#             print("Subcódigo:\t\t", subcodigo)
#             print("Disciplina:\t\t", disciplina)
#             print("Campus:\t\t\t", campus)
#             print("Período:\t\t", periodo)
#             print("Turma:\t\t\t", turma)
#             print("Teoria:\t\t\t")
#             for day in range(teoria_num_of_days):
#                 print('\t',   teoria_dia_da_semana[day])
#                 print('\t\t', teoria_entrada[day], 'às', teoria_saida[day])
#                 print('\t\t', 'Sala:', teoria_sala[day])
#                 print('\t\t', teoria_frequencia[day])
#                 print()
#             print("Prática:\t\t")
#             for day in range(pratica_num_of_days):
#                 print('\t',   pratica_dia_da_semana[day])
#                 print('\t\t', pratica_entrada[day], 'às', pratica_saida[day])
#                 print('\t\t', 'Sala:', pratica_sala[day])
#                 print('\t\t', pratica_frequencia[day])
#                 print()
#             print("Docente teoria:\t\t", docente_teoria)
#             print("Docente prática:\t", docente_pratica)
#             print()
#             print()
#             print()

#     print(full_data)
    with open(file_name + '.json', 'w') as file:
        import json
        json.dump(full_data, file)
        file_json = FileLink(file_name + '.json')
        !echo JSON file saved as:
        display(file_json)
    with open(file_name + '.json', 'r') as file:
        data = json.load(file)

JSON file saved as:
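
The schedule columns are parsed with a chain of `str.partition` calls: the cell text is first split before each weekday name, then every entry is cut at the fixed separators `' das '`, `' às '`, `', sala '`, and `', '`. The standalone sketch below reproduces that logic on a made-up sample string; the sample text and the `parse_schedule` name are assumptions for illustration, inferred from the separators used in the cell above.

```python
WEEK_NAMES = ("segunda", "terça", "quarta", "quinta", "sexta", "sábado", "domingo")

def parse_schedule(cell):
    """Split a schedule cell into one dict per weekday entry."""
    # Start a new line before every weekday name.
    for week in WEEK_NAMES:
        cell = cell.replace(week, "\n" + week)
    entries = []
    for line in cell.replace(", \n", "\n").strip().splitlines():
        day, _, rest = line.partition(" das ")
        start, _, rest = rest.partition(" às ")
        end, _, rest = rest.partition(", sala ")
        room, _, frequency = rest.partition(", ")
        entries.append({"dia_da_semana": day.capitalize(),
                        "horario_de_entrada": start,
                        "horario_de_saida": end,
                        "sala": room.upper(),
                        "frequencia": frequency.capitalize()})
    return entries

# Made-up sample in the assumed format of the PDF's schedule columns.
sample = ("segunda das 08:00 às 10:00, sala s-301-1, semanal, "
          "quarta das 10:00 às 12:00, sala s-301-1, semanal")
for entry in parse_schedule(sample):
    print(entry)
```

`partition` never raises when a separator is missing: it returns the whole string and two empty strings, so malformed cells produce empty fields rather than exceptions.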


JSON preview

In [10]:
RenderJSON(data)

Install the qgrid DataFrame widget if it is not already installed (uncomment the lines below)

In [11]:
# !pip install ipywidgets==6.0.0
# !jupyter nbextension enable --py widgetsnbextension --sys-prefix
# !jupyter labextension install @jupyter-widgets/jupyterlab-manager
# !pip install qgrid

Collecting ipywidgets==6.0.0
  Downloading ipywidgets-6.0.0-py2.py3-none-any.whl (46kB)
    100% |████████████████████████████████| 51kB 3.0MB/s eta 0:00:01
Collecting widgetsnbextension~=2.0.0 (from ipywidgets==6.0.0)
  Downloading widgetsnbextension-2.0.1-py2.py3-none-any.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 1.1MB/s eta 0:00:01
Installing collected packages: widgetsnbextension, ipywidgets
  Found existing installation: widgetsnbextension 3.1.3
    Uninstalling widgetsnbextension-3.1.3:
      Successfully uninstalled widgetsnbextension-3.1.3
  Found existing installation: ipywidgets 7.1.1
    Uninstalling ipywidgets-7.1.1:
      Successfully uninstalled ipywidgets-7.1.1
Successfully installed ipywidgets-6.0.0 widgetsnbextension-2.0.1
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK
> /usr/bin/npm pack @jupyter-widgets/jupy

Done in 61.72s.
Collecting ipywidgets>=7.0.0 (from qgrid)
  Downloading ipywidgets-7.1.2-py2.py3-none-any.whl (68kB)
    100% |████████████████████████████████| 71kB 2.4MB/s eta 0:00:01
Collecting widgetsnbextension~=3.1.0 (from ipywidgets>=7.0.0->qgrid)
  Downloading widgetsnbextension-3.1.4-py2.py3-none-any.whl (2.2MB)
    100% |████████████████████████████████| 2.2MB 554kB/s eta 0:00:01
Installing collected packages: widgetsnbextension, ipywidgets
  Found existing installation: widgetsnbextension 2.0.1
    Uninstalling widgetsnbextension-2.0.1:
      Successfully uninstalled widgetsnbextension-2.0.1
  Found existing installation: ipywidgets 6.0.0
    Uninstalling ipywidgets-6.0.0:
      Successfully uninstalled ipywidgets-6.0.0
Successfully installed ipywidgets-7.1.2 widgetsnbextension-3.1.4


Process the JSON file into a spreadsheet

In [12]:
with open(file_name + '.json', 'r') as file:
    data = json.load(file)
#     print(data[0]['codigo'])
    for disciplina in data:
        print(disciplina['codigo'])

DAESZM035-17SA
NA1MCTB001-17SA
DAMCTB001-17SA
NAMCTB001-17SA
DA1MCTA001-17SA
NA1MCTA001-17SA
DA2MCTA001-17SA
NA2MCTA001-17SA
DB1MCTA001-17SA
DB2MCTA001-17SA
DAMCZA035-14SA
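
The cell above only lists the class codes. One way to move toward a spreadsheet, sketched here with the standard library only, is to flatten each record into one row per schedule entry and write the rows as CSV. The helper names `flatten_classes` and `rows_to_csv` and the sample record values are hypothetical; the dictionary keys are the ones produced by the CSV-to-JSON cell.

```python
import csv
import io

BASE_KEYS = ("codigo", "disciplina", "campus", "periodo", "turma")
SLOT_KEYS = ("dia_da_semana", "horario_de_entrada", "horario_de_saida", "sala", "frequencia")

def flatten_classes(records):
    """Return one flat row per theory/practice schedule entry."""
    rows = []
    for record in records:
        base = {key: record.get(key, "") for key in BASE_KEYS}
        for kind in ("teoria", "pratica"):
            for slot in record.get(kind, []):
                row = dict(base, tipo=kind)
                row.update({key: slot.get(key, "") for key in SLOT_KEYS})
                rows.append(row)
    return rows

def rows_to_csv(rows):
    """Serialize the flat rows to CSV text, using the first row's keys as header."""
    buffer = io.StringIO()
    if rows:
        writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return buffer.getvalue()

# Made-up record with the same keys as the generated JSON.
sample_records = [{
    "codigo": "DAESZM035-17SA", "disciplina": "Exemplo", "campus": "Santo André",
    "periodo": "Diurno", "turma": "A",
    "teoria": [{"dia_da_semana": "Segunda", "horario_de_entrada": "08:00",
                "horario_de_saida": "10:00", "sala": "S-301-1", "frequencia": "Semanal"}],
    "pratica": [],
}]
print(rows_to_csv(flatten_classes(sample_records)))
```

The same list of flat rows could also be passed to `pandas.DataFrame` and shown in the qgrid widget instead of the sample data.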


In [13]:
import numpy as np
import pandas as pd
import qgrid
randn = np.random.randn
df_types = pd.DataFrame({
    'A' : pd.Series(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06', '2013-01-07', '2013-01-08', '2013-01-09'],index=list(range(9)),dtype='datetime64[ns]'),
    'B' : pd.Series(randn(9),index=list(range(9)),dtype='float32'),
    'C' : pd.Categorical(["washington", "adams", "washington", "madison", "lincoln","jefferson", "hamilton", "roosevelt", "kennedy"]),
    'D' : ["foo", "bar", "buzz", "bippity","boppity", "foo", "foo", "bar", "zoo"] })
df_types['E'] = df_types['D'] == 'foo'
qgrid_widget = qgrid.QgridWidget(df=df_types, show_toolbar=True)
qgrid_widget

In [14]:
qgrid_widget.get_changed_df()

Unnamed: 0,A,B,C,D,E
0,2013-01-01,-0.271139,washington,foo,True
1,2013-01-02,-2.060626,adams,bar,False
2,2013-01-03,-0.991228,washington,buzz,False
3,2013-01-04,-0.346572,madison,bippity,False
4,2013-01-05,1.408198,lincoln,boppity,False
5,2013-01-06,-0.913605,jefferson,foo,True
6,2013-01-07,-0.612969,hamilton,foo,True
7,2013-01-08,-0.932671,roosevelt,bar,False
8,2013-01-09,-1.958858,kennedy,zoo,False
