## Data Cleaning<a id="data-cleaning"></a>
![picture](img/picture7.png)
This notebook mainly cleans the data for EDA and visualisation, since the tokenizer generally prepares the data for training.

### Content
* Cleaning of missing data and Duplicates
* Resetting the index
* Feature creation: length of text
* Using UDFs for special characters and stop words
* Exploding cleaned text to get individual words
* Final creation of gold table for visualisation and modelling usage
### Objective
* To demonstrate some of the cleaning steps that an NLP user might need

In [1]:
%_do_not_call_change_endpoint --username  --password  --server https://lighter-staging.vbrani.aisingapore.net/lighter/api  

### Explanation of the need for PV<a id="dc1"></a>
In this project the Spark kernel and the Jupyter Notebook are hosted on Kubernetes as separate deployments, each with its own working directory and cache. Without shared storage this is inefficient and limiting, because code would write into the Spark kernel directory but read from the JupyterLab directory. To resolve this, we created a persistent volume (PV) that acts as shared storage for both deployments.

![picture](img/picture16.png)

In a Kubernetes cluster, managing storage is a different problem than managing compute instances. To address this, the PersistentVolume subsystem provides an API that abstracts the details of how storage is provided from how it is consumed. The subsystem introduces two new API resources: PersistentVolume and PersistentVolumeClaim.

A PersistentVolume (PV) is a piece of storage in the cluster that is either provisioned by an administrator or dynamically provisioned using Storage Classes. PVs are volume plugins and have a lifecycle independent of any Pod that uses the PV. This API object captures the details of the implementation of the storage, such as NFS, iSCSI, or cloud-provider-specific storage.

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources, and PVCs consume PV resources. Users can request a specific size and specific access modes. For a deeper understanding, see the [Kubernetes documentation on persistent volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/), from which this summary is drawn.

### Additional configs for PV in Spark Configuration<a id="dc2"></a>

The Spark configuration therefore has four additional lines for the PVC: two for the driver and two for the executors. Each pair sets the claim name and the mount path.

In [None]:
%%configure -f
{"conf": {
        "spark.sql.warehouse.dir" : "s3a://dataops-example/nlp",
        "spark.hadoop.fs.s3a.access.key":"",
        "spark.hadoop.fs.s3a.secret.key": "",
        "spark.kubernetes.container.image": "justinljg/dep:1.08",
        "spark.kubernetes.container.image.pullPolicy" : "Always",
        "spark.kubernetes.driver.volumes.persistentVolumeClaim.lighter-sparknlptest-pvc.options.claimName": "lighter-sparknlptest-pvc",
        "spark.kubernetes.driver.volumes.persistentVolumeClaim.lighter-sparknlptest-pvc.mount.path": "/opt/spark/work-dir",
        "spark.kubernetes.executor.volumes.persistentVolumeClaim.lighter-sparknlptest-pvc.options.claimName": "lighter-sparknlptest-pvc",
        "spark.kubernetes.executor.volumes.persistentVolumeClaim.lighter-sparknlptest-pvc.mount.path": "/opt/spark/work-dir",
        "spark.jars.packages": "io.delta:delta-core_2.12:2.1.0,za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:1.1.0",
        "spark.sql.queryExecutionListeners": "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener",
        "spark.spline.producer.url": "http://172.19.152.160:8080/producer"
    },
 "executorMemory": "3G",
 "executorCores": 1,
 "driverMemory": "3G",
 "driverCores": 1
}
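
The four PVC keys in the configuration above follow a fixed pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of the original notebook: the helper name `pvc_volume_conf` is hypothetical, and it assumes (as in this notebook) that the volume name and the claim name are the same string.

```python
def pvc_volume_conf(claim_name: str, mount_path: str) -> dict:
    """Build the driver and executor PVC settings for a Spark-on-Kubernetes session.

    Assumption: the volume name in the key is the same as the claim name,
    which matches the configuration used in this notebook.
    """
    conf = {}
    for role in ("driver", "executor"):
        prefix = f"spark.kubernetes.{role}.volumes.persistentVolumeClaim.{claim_name}"
        conf[f"{prefix}.options.claimName"] = claim_name
        conf[f"{prefix}.mount.path"] = mount_path
    return conf


pvc_conf = pvc_volume_conf("lighter-sparknlptest-pvc", "/opt/spark/work-dir")
for key, value in pvc_conf.items():
    print(f"{key} = {value}")
```

The resulting dictionary could be merged into the `"conf"` section of the `%%configure` payload.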

### Additional configs for data lineage in Spark

These lines enable data lineage monitoring with Spline. The Spline UI is available at http://172.19.152.160:9090/app/events/list.

```
"spark.jars.packages": "io.delta:delta-core_2.12:2.1.0,za.co.absa.spline.agent.spark:spark-3.0-spline-agent-bundle_2.12:1.1.0",
"spark.sql.queryExecutionListeners": "za.co.absa.spline.harvester.listener.SplineQueryExecutionListener",
"spark.spline.producer.url": "http://172.19.152.160:8080/producer"
```

![picture](img/picture21.png)

In [None]:
import re
import time

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize.toktok import ToktokTokenizer

import pyspark
from pyspark.sql.functions import col, monotonically_increasing_id, udf
from pyspark.sql.types import ArrayType, StringType

In [4]:
%%sql

USE SparkNLP;

In [5]:
%%sql

SHOW TABLES;

In [6]:
df = spark.read.table("greview_table_bronze")
df.show(truncate=False)

+-----+-------------------+--------------------+-------------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----+-------------------------------------------------------------------------------------+-------------------------------------+----+----+----+----+----+----+----+
|index|userid             |name                |time         |rating|text                                                                                                                                                                                                                                                                                                                            |pics|resp                                              

## Cleaning of missing data and Duplicates

This portion of the code removes missing data and duplicates. `DISTINCT` keeps only unique rows, and the `WHERE` clause requires that none of the selected columns is NULL.

[Example (DISTINCT)](https://www.w3schools.com/sql/sql_distinct.asp), [Example (WHERE)](https://www.w3schools.com/sql/sql_where.asp)
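
The combined effect of `WHERE ... IS NOT NULL` and `DISTINCT` can be sketched in plain Python. This is only an illustration of the semantics: the rows below are made up, `None` stands in for NULL, and the first-seen ordering is a property of this sketch, since SQL does not guarantee any row order.

```python
# Made-up rows in the order (index, userid, rating, text); None plays the role of NULL.
rows = [
    (1, "u1", 5, "Great food"),
    (1, "u1", 5, "Great food"),    # exact duplicate of the first row
    (2, "u2", None, "No rating"),  # NULL rating
    (3, "u3", 4, None),            # NULL text
    (4, "u4", 3, "Okay"),
]

# WHERE col IS NOT NULL AND ...: keep rows in which every value is present
complete_rows = [row for row in rows if all(value is not None for value in row)]

# DISTINCT: drop exact duplicates (dict.fromkeys keeps the first occurrence)
silver_rows = list(dict.fromkeys(complete_rows))

print(silver_rows)  # [(1, 'u1', 5, 'Great food'), (4, 'u4', 3, 'Okay')]
```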

In [7]:
%%sql

CREATE OR REPLACE TEMP VIEW greview_view_silver AS
SELECT DISTINCT index, userid, time, rating, text, gmap_id 
FROM greview_table_bronze
WHERE index IS NOT NULL
  AND userid IS NOT NULL
  AND time IS NOT NULL
  AND rating IS NOT NULL
  AND text IS NOT NULL
  AND gmap_id IS NOT NULL;

In [8]:
df = spark.read.table("greview_view_silver")

In [9]:
df.show(truncate=False)

+-----+-------------------+-------------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------+
|index|userid             |time         |rating|text                                                                                                                     

This query counts the rows in which any of the specified columns is NULL. After the cleaning step above, it should return 0.

[Example (COUNT)](https://www.w3schools.com/sql/sql_count_avg_sum.asp)

In [10]:
%%sql

SELECT COUNT(*) FROM greview_view_silver
WHERE index IS NULL
OR userid IS NULL
OR time IS NULL
OR rating IS NULL
OR text IS NULL
OR gmap_id IS NULL;

In [11]:
%%sql

SELECT count(*) FROM greview_view_silver;

## Resetting the index

As seen earlier, filtering leaves gaps in the original index column (e.g. 371, 1081, 1082), so a new index is generated. This is done with `monotonically_increasing_id` from `pyspark.sql.functions`. The generated IDs are unique and monotonically increasing, but they are not guaranteed to be consecutive.


[Documentation (.withColumn())](https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.DataFrame.withColumn.html), [Documentation (monotonically_increasing_id)](https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.functions.monotonically_increasing_id.html#:~:text=A%20column%20that%20generates%20monotonically,and%20unique%2C%20but%20not%20consecutive.)
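
The Spark documentation linked above describes the current implementation as placing the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits. The following plain-Python sketch imitates that layout to show why the IDs jump between partitions. The function name `sketch_monotonic_id` is hypothetical and this is not Spark's actual code.

```python
def sketch_monotonic_id(partition_id: int, row_in_partition: int) -> int:
    """Imitate the documented bit layout: partition ID in the upper bits,
    per-partition record number in the lower 33 bits."""
    return (partition_id << 33) + row_in_partition


# Two partitions with three rows each
ids = [sketch_monotonic_id(p, r) for p in range(2) for r in range(3)]
print(ids)  # [0, 1, 2, 8589934592, 8589934593, 8589934594]
```

If a strictly consecutive index is required, a `row_number()` window function over an explicit ordering is a common alternative, at the cost of a global sort.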

In [12]:
df = spark.read.table("greview_view_silver")
# add a new column "cleaned_index" using the monotonically_increasing_id function
df = df.withColumn("cleaned_index", monotonically_increasing_id())

# drop the original index column, which cleaned_index replaces
df = df.drop("index")

df.createOrReplaceTempView("greview_view_silver")

In [13]:
df.show(truncate=False)

+-------------------+-------------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------+-------------+
|userid             |time         |rating|text                                                                                                                   

Create a new silver table from the silver view.

In [14]:
%%sql

CREATE TABLE IF NOT EXISTS greview_table_silver
AS SELECT * FROM greview_view_silver;

In [15]:
%%sql
    
SHOW TABLES;

## Feature creation: length of text

This portion creates a new column for the number of characters in the text column.

The SQL function `LOWER()` converts the text to lower case, and `LENGTH()` returns the number of characters in the text.

[Example (LOWER)](https://www.w3schools.com/sql/func_sqlserver_lower.asp), [Example (LENGTH)](https://www.w3schools.com/sql/func_mysql_length.asp)

In [16]:
%%sql

CREATE OR REPLACE TEMP VIEW greview_view_gold AS 
SELECT cleaned_index AS index, text, LOWER(text) AS Cleaned_text, userid, gmap_id, time, rating, length(text) AS text_length 
FROM greview_table_silver;

Read the view and display.

In [17]:
df = spark.read.table("greview_view_gold")
df.show(truncate=False)

+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------------------------+-------------+------+-----------+
|index|text                                                                                                                    

## Using udf for special characters, stop words and stemming

This portion removes special characters with a regular expression, wrapped in a UDF so that it can be applied to a Spark SQL DataFrame.

<br>

#### Regex for removing special characters
The regular expression ```r"[^a-zA-Z0-9\s]+"``` matches one or more consecutive characters that are not alphanumeric (a-zA-Z0-9) or whitespace (\s).

Here's a breakdown of the individual components of the regular expression:

* ```[^a-zA-Z0-9\s]```: This is a character class that matches any character that is not an alphanumeric character or whitespace. The ^ at the beginning of the character class negates it, meaning it matches any character that is not in the character class.

* ```+```: This is a quantifier that matches one or more occurrences of the preceding pattern. In this case, it matches one or more occurrences of the character class [^a-zA-Z0-9\s].

* ```r```: This is a raw string prefix in Python that indicates that backslashes should be treated as literal backslashes, rather than escape characters.
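
The breakdown above can be checked on a small example. This is a minimal sketch with made-up sample strings; the constant name `SPECIAL_CHARS` is only for illustration. Note two side effects: because matches are replaced with an empty string, tokens separated only by a special character are joined together, and non-ASCII letters are removed because the character class lists only ASCII letters and digits.

```python
import re

# Same pattern as in the text, compiled once for reuse
SPECIAL_CHARS = re.compile(r"[^a-zA-Z0-9\s]+")

sample = "great food!!! 10/10, would visit again :)"
cleaned = SPECIAL_CHARS.sub("", sample)
print(repr(cleaned))  # 'great food 1010 would visit again '

# Non-ASCII letters are not in the character class, so they are stripped too
accented = SPECIAL_CHARS.sub("", "café")
print(accented)  # caf
```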

<br>

#### PySpark udf function
The pyspark.sql.functions.udf() method creates a user-defined function (UDF) that can be used with PySpark's DataFrame API. This allows you to apply your own custom functions to the data in your DataFrames.

<b>pyspark.sql.functions.udf(f=None, returnType=StringType)</b>

The method takes two parameters:

* `f`: This is the Python function that you want to use as your UDF. This function can take any number of arguments, but it should return a single value. When you call the UDF on a DataFrame column, each value in the column will be passed as an argument to this function.

* `returnType`: This parameter specifies the return type of the UDF. You can either pass a pyspark.sql.types.DataType object that specifies the type directly, or you can pass a string that represents the type in DDL (Data Definition Language) format.

[Documentation (udf)](https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.functions.udf.html)

In [18]:
def remove_sc(text: str) -> str:
    """
    Removes all special characters from text.

    Args:
        text (str): The input text.

    Returns:
        str: The text without special characters.
    """
    # Define a regular expression to match special characters
    regex = r"[^a-zA-Z0-9\s]+"
    
    # Use the sub() method to replace special characters with an empty string
    cleaned_text = re.sub(regex, "", text)
    
    return cleaned_text

removescUDF = udf(remove_sc, StringType())

In [19]:
df = df.withColumn("Cleaned_text", removescUDF(col("Cleaned_text")))

In [20]:
df.show(truncate=False)

+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------------------------+-------------+------+-----------+
|index|text                                                                                                                                   

#### Downloading resources for NLTK libraries
Next, the NLTK library is used to remove the stopwords. The `stopwords` corpus provides the stopword list, and the `punkt` resource provides the data needed to tokenize text into individual words. Both are first downloaded to the PV, and the PV path is then appended to `nltk.data.path` so that NLTK can find them.

[Documentation (download)](https://www.nltk.org/data.html), [Documentation (data.path)](https://www.nltk.org/api/nltk.data.html)

In [21]:
nltk.download('stopwords', download_dir="/opt/spark/work-dir")
nltk.download('punkt', download_dir="/opt/spark/work-dir")
nltk.data.path.append('/opt/spark/work-dir')

In [22]:
# English stopwords as a list
stopword_list = nltk.corpus.stopwords.words('english')

# The same stopwords as a set, for fast membership tests
stop = set(stopwords.words('english'))

# Stemmer and lemmatizer instances
stem = PorterStemmer()
lem = WordNetLemmatizer()

#### UDF for removing stopwords
The cell below defines `remove_stopwords`, which takes a text string and removes all stopwords from it. The function tokenizes the input with `ToktokTokenizer`, strips surrounding whitespace from each token, and then filters the tokens according to `is_lower_case`. If `is_lower_case` is True, the text is assumed to be lowercase already, so only tokens that exactly match a stopword are removed. If it is False, each token is lowercased before the comparison, so stopwords are removed even when they contain uppercase letters. The remaining tokens are joined back into a single space-separated string. Although the text has already been lowercased, the `is_lower_case` branch is kept as a fail-safe.

In [None]:
def remove_stopwords(text: str, is_lower_case: bool = True) -> str:
    """
    Removes stopwords from the given text.

    Args:
        text (str): The input text.
        is_lower_case (bool): If True, the text is assumed to be lowercase already and
                              tokens are compared to the stopword list as-is.
                              If False, each token is lowercased before the comparison.
                              Defaults to True.

    Returns:
        str: The text without stopwords.
    """
    # Set the list of English stopwords
    stopword_list = set(stopwords.words('english'))

    # Tokenize the input text
    tokenizer = ToktokTokenizer()
    tokens = tokenizer.tokenize(text)

    # Remove whitespace from each token
    tokens = [token.strip() for token in tokens]

    # Filter out stopwords based on their case
    if is_lower_case:
        filtered_tokens = [token for token in tokens if token not in stopword_list]
    else:
        filtered_tokens = [token for token in tokens if token.lower() not in stopword_list]

    # Join the filtered tokens back into a string
    filtered_text = ' '.join(filtered_tokens)

    return filtered_text

remove_stopwordsUDF = udf(remove_stopwords, StringType())

Finally, the remove_stopwords function is wrapped as a user-defined function (UDF) called remove_stopwordsUDF using the udf() method from the pyspark.sql.functions module, with a return type of StringType().

[Example (udf)](https://sparkbyexamples.com/pyspark/pyspark-udf-user-defined-function/)
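
The effect of the `is_lower_case` flag is easiest to see on mixed-case input. The sketch below mirrors the branching logic of `remove_stopwords` in plain Python so that it runs without NLTK or Spark. The three-word `STOPWORDS` set, the whitespace tokenization, and the name `drop_stopwords` are simplifying assumptions, not the notebook's actual implementation.

```python
# Tiny stand-in for NLTK's English stopword list (assumption for illustration)
STOPWORDS = {"the", "is", "a"}


def drop_stopwords(text: str, is_lower_case: bool = True) -> str:
    """Mirror the flag handling of remove_stopwords with whitespace tokenization."""
    tokens = text.split()
    if is_lower_case:
        # Compare tokens as-is: capitalised stopwords slip through
        kept = [token for token in tokens if token not in STOPWORDS]
    else:
        # Lowercase each token before comparing: case no longer matters
        kept = [token for token in tokens if token.lower() not in STOPWORDS]
    return " ".join(kept)


as_is = drop_stopwords("The food is great")
case_insensitive = drop_stopwords("The food is great", is_lower_case=False)
print(as_is)             # The food great
print(case_insensitive)  # food great
```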

In [24]:
df = df.withColumn("Cleaned_text", remove_stopwordsUDF(col("Cleaned_text")))

In [25]:
df.show(truncate=False)

+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------------------------+-------------+------+-----------+
|index|text                                                                                                                                                                                                                                                                           

#### Processing words for topic modelling
Next, processed text is needed for topic modelling later on. The general procedure is the same as for the removal of stopwords. The UDF, however, is defined more concisely with the `@udf` decorator. The text is tokenized, stopwords and tokens of two characters or fewer are removed, and the remaining words are stemmed.

In [26]:
@udf(returnType=ArrayType(StringType()))
def preprocess_text(text):
    words = nltk.word_tokenize(text.lower())
    words = [w for w in words if w not in stop and len(w) > 2]
    words = [stem.stem(w) for w in words]
    return words

df = df.withColumn("preprocessed_text", preprocess_text(df.text))


In [27]:
df.show(truncate=False)

+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------------------------+-------------+------+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|index|text                

In [28]:
df.createOrReplaceTempView("greview_view_gold")

## Exploding cleaned text to get individual words

This query splits the cleaned text into a list of words on spaces, and the `explode` function then emits each word of the list as its own row in the `word` column of the new view.

[Example (split)](https://www.w3schools.com/sql/func_msaccess_split.asp), [Example (explode)](https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.functions.explode.html)
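
Conceptually, `explode(split(...))` flattens a column of strings into one row per word. The plain-Python sketch below shows the same idea on made-up reviews. It also shows a detail worth checking in the real data: splitting on a single space produces empty strings wherever the text contains consecutive spaces, for example where special characters or stopwords were removed.

```python
# Made-up cleaned reviews; the second contains a double space
reviews = ["great food", "friendly  staff"]

# split(' ') then explode: one output element per word, across all rows
words = [word for text in reviews for word in text.split(" ")]
print(words)  # ['great', 'food', 'friendly', '', 'staff']

# Optional follow-up: drop the empty tokens before counting words
non_empty_words = [word for word in words if word]
print(non_empty_words)  # ['great', 'food', 'friendly', 'staff']
```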

In [29]:
%%sql

CREATE OR REPLACE TEMP VIEW greview_view_gold2 AS
SELECT explode(split(Cleaned_text, ' ')) as word
FROM greview_view_gold;

## Final creation of gold table for visualisation and modelling usage

This portion creates the views and tables needed for modelling and visualisation using sql queries which will not be explained in detail.

In [30]:
%%sql

CREATE TABLE IF NOT EXISTS greview_viz
AS SELECT * FROM greview_view_gold;

In [31]:
df = spark.read.table("greview_viz")
df.show(truncate=False)

+-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-------------------------------------+-------------+------+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|index|text                

In [32]:
%%sql

DROP TABLE IF EXISTS greview_viz_word;

In [33]:
%%sql

CREATE TABLE IF NOT EXISTS greview_viz_word
AS SELECT * FROM greview_view_gold2;

In [34]:
%%sql

SELECT COUNT(*) FROM greview_viz_word;

Create the table for modelling.

In [35]:
%%sql

DROP TABLE IF EXISTS greview_model;

In [36]:
%%sql

CREATE TABLE IF NOT EXISTS greview_model AS
SELECT text,rating AS label FROM greview_view_gold;

In [37]:
df = spark.read.table("greview_model")
df.show(truncate=False)

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----+
|text                                                                                                                                                                                                                                                                                                                                                                                             |label|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In [38]:
df.write.format("delta") \
    .mode("overwrite") \
    .save("s3a://dataops-example/justin/nlp/greview_model")

In [39]:
%%sql
    
SHOW TABLES;
