<a href="https://colab.research.google.com/github/grojasc/Proyecto_aplicado_I_2022/blob/main/Streamlit_procesamiento_texto.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MIA UC (Magíster en Inteligencia Artificial) - Proyecto Aplicado I course
## Streamlit Text Processing Lab

**Instructor:** Manuel Cartagena

**Teaching assistants:** Nicolás Sumonte - Álvaro Labarca

**Thursday, November 17, 2022**

In this activity we will use [**gensim**](https://radimrehurek.com/gensim/) and [**sumy**](https://miso-belica.github.io/sumy/) for NLP functions, **Streamlit** to prototype the application, and **ngrok** to make it reachable through a public URL.

In [None]:
!pip install streamlit==1.14.0
!pip install "gensim<4"  # gensim.summarization, imported by app.py, was removed in gensim 4.0
!pip install sumy


Next, we build the application code by writing the file `app.py`, which must contain every component we want Streamlit to interpret and render in the application.
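Since `app.py` imports `gensim.summarization`, a module that gensim removed in version 4.0, it can help to confirm which gensim version is installed before writing the app. The following is a minimal sketch using only the standard library; the helper names `major_version`, `gensim_has_summarization`, and `installed_version` are illustrative and not part of the original notebook.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '3.8.3'."""
    return int(version.split(".")[0])


def gensim_has_summarization(version: str) -> bool:
    """gensim.summarization was removed in gensim 4.0, so only older releases ship it."""
    return major_version(version) < 4


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("gensim")
if version is None:
    print("gensim is not installed")
elif gensim_has_summarization(version):
    print(f"gensim {version}: gensim.summarization should be importable")
else:
    print(f"gensim {version}: install 'gensim<4' to use gensim.summarization")
```

If the check reports gensim 4 or newer, the `gensim` option of the summarizer in `app.py` would fail at import time, so pinning the version at install time is the simpler route.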

The following cell contains the application (`app.py`). After any modification, run the cell again to save the new version of the app.

In [None]:
%%writefile app.py 

import streamlit as st
from textblob import TextBlob
import spacy
from gensim.summarization import summarize  # requires gensim < 4

# Text summarization package
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer

import nltk
nltk.download('punkt')  # sentence tokenizer data used by sumy's Tokenizer


# Summarization function
def sumy_summarizer(docx):
    """Return a LexRank summary (up to three sentences) of the English text `docx`."""
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    lex_summarizer = LexRankSummarizer()
    summary = lex_summarizer(parser.document, 3)  # keep the 3 top-ranked sentences
    summary_list = [str(sentence) for sentence in summary]
    result = ' '.join(summary_list)
    return result

# Function to get tokens and lemmas
@st.cache
def text_analyzer(my_text):
    nlp = spacy.load('en_core_web_sm')
    docx = nlp(my_text)
    # One '"Token":...,"Lemma":...' string per token
    allData = ['"Token":{},\n"Lemma":{}'.format(token.text, token.lemma_) for token in docx]
    return allData

# Function to extract entities
@st.cache
def entity_analyzer(my_text):
    nlp = spacy.load('en_core_web_sm')
    docx = nlp(my_text)
    tokens = [token.text for token in docx]
    entities = [(entity.text, entity.label_) for entity in docx.ents]
    # A single string listing every token, followed by the (text, label) entity pairs
    allData = ['"Token":{},\n"Entities":{}'.format(tokens, entities)]
    return allData

# Title
st.title("NLP Application with Streamlit")
st.subheader("Natural Language Processing On the Go..")
st.markdown("""
    #### Description
    + This NLP application performs several tasks on English text:
    tokenization, NER, sentiment analysis, and summarization.
    """)

# Tokenizer
if st.checkbox("Show Tokens and Lemmas"):
  st.subheader("Tokenize Your Text")

  # Explicit keys keep widgets with identical labels from colliding
  message = st.text_area("Enter Text", "Type here ..", key="tokens_text")
  if st.button("Analyze", key="tokens_button"):
    nlp_result = text_analyzer(message)
    st.json(nlp_result)

# Entity Extraction
if st.checkbox("Show Entities"):
  st.subheader("Analyze Text")

  message = st.text_area("Enter Text", "Type here ..", key="entities_text")

  if st.button("Extract", key="entities_button"):
    entity_result = entity_analyzer(message)
    st.json(entity_result)

# Sentiment Analysis
if st.checkbox("Show Sentiment Analysis"):
  st.subheader("Analyze Text")

  message = st.text_area("Enter Text", "Type here ..", key="sentiment_text")
  if st.button("Analyze", key="sentiment_button"):
    blob = TextBlob(message)
    # blob.sentiment is a (polarity, subjectivity) named tuple
    result_sentiment = blob.sentiment
    st.success(result_sentiment)

# Summarization
if st.checkbox("Show Text Summarization"):
  st.subheader("Summarize Text")

  message = st.text_area("Enter Text", "Type here ..", key="summary_text")
  summary_options = st.selectbox("Choose Summarizer", ['sumy', 'gensim'])

  if st.button("Summarize", key="summary_button"):
    if summary_options == 'sumy':
      st.text("Using Sumy Summarizer ..")
      summary_result = sumy_summarizer(message)

    elif summary_options == 'gensim':
      st.text("Using Gensim Summarizer ..")
      summary_result = summarize(message)

    else:
      st.warning("Using Default Summarizer")
      st.text("Using Gensim Summarizer ..")
      summary_result = summarize(message)

    st.success(summary_result)

Overwriting app.py
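To build intuition for what the summarizers in `app.py` do, here is a minimal sketch of frequency-based extractive summarization written with the standard library only. It is a simplified stand-in and not the LexRank algorithm used by sumy nor the algorithm behind gensim's `summarize`: it scores each sentence by the average document-wide frequency of its words and keeps the top-scoring sentences in their original order. The function names and the naive regex-based sentence splitter are assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def frequency_summary(text, n_sentences=3):
    """Keep the n highest-scoring sentences, in their original order.

    A sentence's score is the sum of the document-wide counts of its words,
    divided by its word count so that long sentences are not favored.
    """
    sentences = split_sentences(text)
    word_counts = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(word_counts[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore document order for the kept ones
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


text = "Cats sleep a lot. Cats eat fish. Dogs bark."
print(frequency_summary(text, n_sentences=1))
```

Like the summarizers in the app, this approach is extractive: it selects existing sentences rather than generating new text. A real implementation would at least remove stop words and use a proper sentence tokenizer such as the `punkt` model downloaded in `app.py`.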


## Run the application in another window

To run the application, we need to install [pyngrok](https://pyngrok.readthedocs.io/en/latest/index.html), a Python wrapper for ngrok.

In [None]:
!pip install pyngrok



In [None]:
from pyngrok import ngrok

To generate the link, you need to sign up at this [link](https://dashboard.ngrok.com/get-started/setup) and replace TOKEN_NGROK with the code shown under "Your Authtoken".

In [None]:
# INSERT YOUR OWN NGROK TOKEN
!ngrok authtoken TOKEN_NGROK

Authtoken saved to configuration file: /root/.ngrok2/ngrok.yml
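Pasting a personal authtoken directly into a notebook cell makes it easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the token from an environment variable instead. The helper `get_ngrok_token` and the variable name `NGROK_AUTHTOKEN` are assumptions chosen for this example, not something the original notebook defines.

```python
import os


def get_ngrok_token(env=None, var_name="NGROK_AUTHTOKEN"):
    """Return the token stored in `var_name`, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your ngrok authtoken"
        )
    return token


# Example with an explicit mapping instead of the real environment
print(get_ngrok_token({"NGROK_AUTHTOKEN": "TOKEN_NGROK"}))
```

With the token obtained this way, it could be registered from Python through pyngrok's `ngrok.set_auth_token(get_ngrok_token())` rather than through the shell command above.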


In [None]:
# Streamlit listens on port 8501 by default; pyngrok takes the local port as its first argument
public_url = ngrok.connect(8501)
public_url

<NgrokTunnel: "http://0404-34-86-72-84.ngrok.io" -> "http://localhost:80">

Finally, we run the application. The output prints the link (the `your url is:` line) where the app can be seen in action.

In [None]:
!streamlit run app.py & npx localtunnel --port 8501

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
npx: installed 22 in 3.204s

  You can now view your Streamlit app in your browser.

  Network URL: http://172.28.0.2:8501
  External URL: http://34.86.72.84:8501

your url is: https://huge-plants-tan-34-86-72-84.loca.lt
2022-11-17 23:01:04.474806: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-11-17 23:01:04.966 'pattern' package not found; tag filters are not available for English
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
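Because the command above starts Streamlit and the tunnel at the same time, the public URL may be printed before the app is ready to answer requests. The sketch below shows one way to wait until a local port accepts TCP connections, using only the standard library. The helper `wait_for_port` is hypothetical and not part of Streamlit, pyngrok, or localtunnel.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some server is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# 8501 is Streamlit's default port; this prints False if nothing is listening yet
print(wait_for_port(8501, timeout=1.0))
```

One possible use, under the assumption that Streamlit is launched in the background first (for example with `subprocess.Popen(["streamlit", "run", "app.py"])`), is to call `wait_for_port(8501)` and open the tunnel only after it returns `True`.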