# Notebooks Analysis Results

### Imports
* "os" for file handling
* "pandas" for building the dataframe that holds the analysis results
* "pynblint", the module containing the operationalizations of the best practices
* "config" for configurable project paths
* "entities" for modeling the subjects of the study (notebooks, etc.)

In [1]:
import os
import pandas

from pathlib import Path

import config
import pynblint

from entities import Notebook, GitHubRepository, LocalRepository

### Processing

Each target notebook is analyzed by applying the quality-analysis functions that reflect the identified best practices.

In [2]:
data = []
for filename in os.listdir(config.data_path):
    # Only Jupyter notebooks in the data folder are analyzed
    if filename.endswith(".ipynb"):
        notebook_path = Path(config.data_path) / filename
        notebook = Notebook(notebook_path)
        data.append(notebook.get_pynblint_results())
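
To make the idea of a "quality-analysis function" more concrete, here is a minimal sketch of how two of the reported observations (execution order and non-executed cells) could be derived from the notebook JSON. It is not the pynblint implementation: the function names are hypothetical, and the definitions (strictly increasing execution counters, `execution_count` equal to `None`) are assumptions. It relies only on the standard nbformat 4 layout, where each cell has a `cell_type` and code cells carry an `execution_count`.

```python
def code_execution_counts(notebook: dict) -> list:
    """Return the execution counters of the code cells, in document order."""
    return [
        cell.get("execution_count")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]


def has_linear_execution_order(notebook: dict) -> bool:
    """Assumed definition: executed code cells have strictly increasing counters."""
    counts = [c for c in code_execution_counts(notebook) if c is not None]
    return all(a < b for a, b in zip(counts, counts[1:]))


def count_non_executed_cells(notebook: dict) -> int:
    """Assumed definition: code cells whose counter is missing (None)."""
    return sum(1 for c in code_execution_counts(notebook) if c is None)


# Tiny in-memory notebook (hypothetical data, nbformat 4 layout)
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "execution_count": 1, "source": ["x = 1"]},
        {"cell_type": "code", "execution_count": 3, "source": ["print(x)"]},
        {"cell_type": "code", "execution_count": None, "source": ["y = 2"]},
    ]
}

print(has_linear_execution_order(example))
print(count_non_executed_cells(example))
```

Note that under this assumed definition an empty code cell also counts as non-executed; the actual tool may treat the two cases separately.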

### Display results

All observations about a notebook are stored in a single tuple, so the dataframe is built from a list of tuples, with the corresponding observation names assigned to the columns. Each row is a target notebook.

In [3]:
df = pandas.DataFrame(
    data,
    columns=[
        "Notebook Name", "Correc Execution Order", "Classes", "Functions",
        "Imports Correct Position", "Markdown Rows", "Markdown Titles",
        "Bottom md percentage", "Non-executed Cells", "Empty Cells",
        "Bottom Non-executed Cells", "Bottom Empty Cells",
        "Total cells", "MD cells", "Code cells", "Raw cells",
    ],
)
df

Unnamed: 0,Notebook Name,Correc Execution Order,Classes,Functions,Imports Correct Position,Markdown Rows,Markdown Titles,Bottom md percentage,Non-executed Cells,Empty Cells,Bottom Non-executed Cells,Bottom Empty Cells,Total cells,MD cells,Code cells,Raw cells
0,my-attempt-at-analytics-vidhya-job-a-thon.ipynb,True,0,3,False,7,2,0.0,0,3,0,2,78,4,74,0
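
The cell-count columns in the table above ("Total cells", "MD cells", "Code cells", "Raw cells", "Empty Cells") can be illustrated with a short sketch that works directly on the notebook JSON. The helper names below are hypothetical, and treating a cell as empty when its source is blank is an assumption; the rules actually applied by `get_pynblint_results()` may differ.

```python
import json
from collections import Counter


def cell_source(cell: dict) -> str:
    """Return the cell source as one string (nbformat allows a list of lines)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def count_cells(notebook: dict) -> dict:
    """Count cells by type, plus cells whose source is blank (assumed 'empty')."""
    cells = notebook.get("cells", [])
    by_type = Counter(cell.get("cell_type") for cell in cells)
    empty = sum(1 for cell in cells if not cell_source(cell).strip())
    return {
        "Total cells": len(cells),
        "MD cells": by_type["markdown"],
        "Code cells": by_type["code"],
        "Raw cells": by_type["raw"],
        "Empty Cells": empty,
    }


def count_cells_in_file(path) -> dict:
    """Load an .ipynb file (plain JSON) and count its cells."""
    with open(path, encoding="utf-8") as handle:
        return count_cells(json.load(handle))


# Hypothetical in-memory notebook used only to exercise the sketch
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Intro\n", "Some text"]},
        {"cell_type": "code", "execution_count": 1, "source": "import os"},
        {"cell_type": "code", "execution_count": None, "source": []},
        {"cell_type": "raw", "source": ["   "]},
    ]
}

print(count_cells(sample))
```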


## Analysis of a GitHub repository

In [4]:
repo = GitHubRepository('https://github.com/collab-uniba/Sentiment_Analysis_4SE_BERT')

In [6]:
data=[]
for notebook in repo.notebooks:
    data.append(notebook.get_pynblint_results())
    
cols = [
    "Notebook Name",
    "Correc Execution Order",
    "Classes",
    "Functions",
    "Imports Correct Position",
    "Markdown Rows",
    "Markdown Titles",
    "Bottom md percentage",
    "Non-executed Cells",
    "Empty Cells",
    "Bottom Non-executed Cells",
    "Bottom Empty Cells",
    "Total cells",
    "MD cells",
    "Code cells",
    "Raw cells"
]
    
dataframe = pandas.DataFrame(data, columns=cols)
dataframe

Unnamed: 0,Notebook Name,Correc Execution Order,Classes,Functions,Imports Correct Position,Markdown Rows,Markdown Titles,Bottom md percentage,Non-executed Cells,Empty Cells,Bottom Non-executed Cells,Bottom Empty Cells,Total cells,MD cells,Code cells,Raw cells
0,Sentiment_Analysis_4SE_BERT/notebooks/stackove...,True,0,0,False,1,0,0.0,0,0,0,0,18,1,17,0
1,Sentiment_Analysis_4SE_BERT/notebooks/github-w...,True,0,0,False,0,0,0.0,0,0,0,0,17,0,17,0
2,Sentiment_Analysis_4SE_BERT/notebooks/cross-pl...,True,0,0,False,0,0,0.0,0,0,0,0,13,0,13,0
3,Sentiment_Analysis_4SE_BERT/notebooks/jira-w-b...,True,0,0,False,0,0,0.0,0,0,0,0,17,0,17,0
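
The internals of `GitHubRepository` are not shown in this notebook. As a rough illustration of how a `notebooks` collection could be gathered from a local clone, the sketch below walks a directory tree and collects every `.ipynb` file. The function name `find_notebooks` and the choice to skip `.ipynb_checkpoints` folders are assumptions, not the behavior of the `entities` module.

```python
from pathlib import Path


def find_notebooks(repo_root) -> list:
    """Return all .ipynb files under repo_root, skipping Jupyter checkpoint copies."""
    root = Path(repo_root)
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


if __name__ == "__main__":
    # Example usage on the current directory (prints paths relative to it)
    for notebook_path in find_notebooks("."):
        print(notebook_path)
```

Each returned path could then be wrapped in a `Notebook` object, as done for the local data folder earlier.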
