# Text AI Preprocessing for the Demo

This notebook runs the Text AI preprocessing for the demo.

## Prerequisites

Before using this notebook, complete the following step:
1. [Configure the AI-Lab](../main_config.ipynb).

## Setup

### Open Secure Configuration Storage

In [1]:
%run ../../utils/access_store_ui.ipynb
display(get_access_store_ui('../../'))


## Setup Demo

In [5]:
%run utils/xp_default_extractor.ipynb

In [3]:
from exasol.ai.text.extraction import *
from exasol.ai.text.extraction.extraction import Extraction
from exasol.ai.text.extraction.abstract_extraction import Output

In [4]:
schema = ai_lab_config.db_schema

In [5]:
%run ../../transformers/utils/model_retrieval.ipynb

In [6]:
load_huggingface_model(ai_lab_config, NAMED_ENTITY_MODEL, 'token-classification')

In [7]:
load_huggingface_model(ai_lab_config, NLI_MODEL, 'zero-shot-classification')

In [8]:
load_huggingface_model(ai_lab_config, FEATURE_EXTRACTION_MODEL, 'feature-extraction')

## Run Preprocessing for Demo

In [11]:
from exasol.nb_connector.connections import open_pyexasol_connection
from exasol.nb_connector.language_container_activation import get_activation_sql

activation_sql = get_activation_sql(ai_lab_config)

In [39]:
extraction = Extraction(
    extractor=PipelineExtractor(
        steps=[
            # Step 1: read TICKET_DESCRIPTION from CUSTOMER_SUPPORT_TICKETS, keyed by TICKET_ID.
            SourceTableExtractor(
                sources=[
                    SchemaSource(
                        db_schema=NameSelector(pattern=schema),
                        tables=[
                            TableSource(
                                table=NameSelector(pattern="CUSTOMER_SUPPORT_TICKETS"),
                                columns=[NameSelector(pattern="TICKET_DESCRIPTION")],
                                keys=[NameSelector(pattern="TICKET_ID")],
                            )
                        ],
                    )
                ]
            ),
            # Step 2: run named entity recognition, topic classification, and keyword search.
            DefaultExtractor(
                named_entity_recognition_enabled=True,
                topic_classification_enabled=True,
                keyword_search_enabled=True,
                topics=["urgent", "not urgent"],
                parallelism_per_node=2,
            ),
        ]
    ),
    # Write the results to the same schema.
    output=Output(db_schema=schema),
)

In [40]:
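with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
    # Activate the script language containers for this session.
    conn.execute(query=activation_sql)
    # Run the extraction; the third argument names the script language to use.
    extraction.run(conn, schema, "PYTHON3_TXAIE")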

In [7]:
%run ../../utils/jupysql_init.ipynb

In [8]:
%config SqlMagic.displaylimit = 20

In [9]:
%%sql
SELECT TABLE_SCHEMA, TABLE_NAME FROM EXA_ALL_TABLES

table_schema,table_name
AI_LAB,CUSTOMER_SUPPORT_TICKETS
AI_LAB,PRODUCTS
AI_LAB,DOCUMENTS
AI_LAB,DOCUMENTS_AI_LAB_CUSTOMER_SUPPORT_TICKETS
AI_LAB,NAMED_ENTITY
AI_LAB,NAMED_ENTITY_LOOKUP_ENTITY_TYPE
AI_LAB,NAMED_ENTITY_LOOKUP_SETUP
AI_LAB,TOPIC_CLASSIFIER
AI_LAB,TOPIC_CLASSIFIER_LOOKUP_TOPIC
AI_LAB,TOPIC_CLASSIFIER_LOOKUP_SETUP


In [10]:
%%sql
SELECT VIEW_SCHEMA, VIEW_NAME FROM EXA_ALL_VIEWS

view_schema,view_name
AI_LAB,NAMED_ENTITY_VIEW
AI_LAB,KEYWORD_SEARCH_VIEW
AI_LAB,TOPIC_CLASSIFIER_VIEW
AI_LAB,ENTITIES_WITH_TOPICS
AI_LAB,URGENT_PRODUCTS


In [11]:
%%sql
SELECT * FROM {{schema}}.DOCUMENTS as d

text_doc_id,text_char_begin,text_char_end,TEXT
6985,0,245,"I'm having an issue with the Bose QuietComfort. Please assist. Click here to return to the previous page or to return to the following page. The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
6986,0,248,I'm having an issue with the Nikon D. Please assist. The Nikon D will be sold after this month's month break. After the end of this month the Nikon D will I'm concerned about the security of my Nikon D and would like to ensure that my data is safe.
6987,0,253,"I'm having an issue with the Canon DSLR Camera. Please assist. A free copy of ""Goodbye, my little #shitty boy"" is also released as both a Kindle and Nook Edition. I've already contacted customer support multiple times, but the issue remains unresolved."
6988,0,292,"I've noticed a software bug in the HP Pavilion app. It's causing data loss and unexpected errors. How can I resolve this issue? If you're not sure if the app is stable, report it to us I've performed a factory reset on my HP Pavilion, hoping it would resolve the problem, but it didn't help."
6989,0,310,"I'm having trouble connecting my Amazon Echo to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The ""Connecting Devices"", I'm concerned about the security of my Amazon Echo and would like to ensure that my data is safe."
6990,0,291,"I'm having an issue with the Fitbit Versa Smartwatch. Please assist. If you're not sure which product is right for you, visit our support page. If you want to update this guide, we strongly suggest you I've checked the device settings and made sure that everything is configured correctly."
6991,0,315,"There seems to be a glitch in the Canon DSLR Camera software. It freezes frequently, making it difficult to use. Can you please provide a solution? GIF recolors are the best tool in the game to repaint This problem started occurring after the recent software update. I haven't made any other changes to the device."
6992,0,325,"I've recently set up my Microsoft Xbox Controller, but it fails to connect to any available networks. What steps should I take to troubleshoot this issue? If you're the user that's using Microsoft Windows Media Encoder I've tried different settings and configurations on my Microsoft Xbox Controller, but the issue persists."
6993,0,277,"My Bose QuietComfort is making strange noises and not functioning properly. I suspect there might be a hardware issue. Can you please help me with this? [19:39:34]SAY: Medibot/ : I've tried different settings and configurations on my Bose QuietComfort, but the issue persists."
6994,0,304,"I've accidentally deleted important data from my Samsung Soundbar. Is there any way to recover the deleted files? I need them urgently. 1k Hi, I've accidentally deleted important data from my product. Is there I've checked for any available software updates for my Samsung Soundbar, but there are none."


In [13]:
%%sql
SELECT t.TOPIC, t.TOPIC_SCORE, d.TEXT_DOC_ID, d.TEXT_CHAR_BEGIN, d.TEXT_CHAR_END, d.TEXT
FROM {{schema}}.TOPIC_CLASSIFIER_VIEW as t JOIN {{schema}}.DOCUMENTS as d ON d.TEXT_DOC_ID = t.TEXT_DOC_ID 
WHERE t.TOPIC_RANK=1

topic,topic_score,text_doc_id,text_char_begin,text_char_end,TEXT
not urgent,0.6771222352981567,3856,0,334,"I'm having an issue with the Dell XPS. Please assist. We understand that some customers have found this to be confusing. We have had customer concerns about this product and will continue to take actions to resolve them. Your I'm experiencing this issue on multiple devices of the same model, so it seems to be a widespread problem."
urgent,0.9683608412742616,6218,0,334,"I've accidentally deleted important data from my Microsoft Xbxo Controller. Is there any way to recover the deleted files? I need them urgently. I am in the middle of a divorce case. Am I able to get a lawyer for those files I've reviewed the troubleshooting steps on the official support website, but they didn't resolve the problem."
not urgent,0.593508780002594,396,0,334,"I'm having an issue with the Bose SoundLink Speaker. Please assist. The first couple times I've tried to make an effort to make a transaction after using the product, it just gets stuck and I don't think many people I've performed a factory reset on my Bose SoundLink Speaker, hoping it would resolve the problem, but it didn't help."
not urgent,0.6238230466842651,5378,0,334,I'm having an issue with the Philips Hue Lights. Please assist. The '#refresh' setting was not found on this specific product. The product information is not currently available. Please check your product details before purchasing. I've noticed a sudden decrease in battery life on my Philips Hue Lights. It used to last much longer.
urgent,0.715283989906311,7064,0,334,"I'm having an issue with the Adobe Photoshop. Please assist. Please. I am currently working on updating the packaging for this product and I'm afraid most of the pictures are incomplete, so please feel free to ask if you have I've tried using different cables, adapters, or peripherals with my Adobe Photoshop, but the issue persists."
not urgent,0.5347158312797546,1150,0,334,"I'm having an issue with the Microsoft Surface. Please assist. The products and services offered by the Vendor are the legal property of their respective Ownerships, and in no circumstances are these services or the ownership or control of I've tried clearing the cache and data for the Microsoft Surface app, but the issue persists."
not urgent,0.6249628067016602,3966,0,334,"The Sony Xpeira is unable to establish a stable internet connection. It keeps disconnecting intermittently. How can I troubleshoot this network problem? If you are using a router that doesn't respond regularly to requests for web- The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
not urgent,0.7034529447555542,3216,0,334,"I've encountered a data loss issue with my Nikon D. All the files and documents seem to have disappeared. Can you guide me on how to retrieve them? Thanks for the question. My data was stolen by an unknown I've recently updated the firmware of my Nikon D, and the issue started happening afterward. Could it be related to the update?"
not urgent,0.7696776986122131,2044,0,334,"There seems to be a hardware problem with my Canon EOS. The screen is flickering, and I'm unable to use it. What should I do? The answer is follow the instructions given in the FAQ: #Make sure the package I've recently updated the firmware of my Canon EOS, and the issue started happening afterward. Could it be related to the update?"
not urgent,0.667040228843689,6594,0,334,"I'm having an issue with the Samsung Soundbra. Please assist. If you have any questions, please send an email to support.lucky@gmail.com if you have any information about the product(s) sold and we I've recently updated the firmware of my Samsung Soundbra, and the issue started happening afterward. Could it be related to the update?"
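
The `TOPIC_RANK=1` filter in the query above keeps one topic per document. The following is a minimal Python sketch of that selection, assuming the rank orders topics by descending score within each document (the view's definition is not shown in this notebook). The function name `top_topic_per_doc` is hypothetical, and the rows are toy values, not data from the database.

```python
def top_topic_per_doc(rows):
    """Return {doc_id: (topic, score)}, keeping the best-scoring topic per document."""
    best = {}
    for doc_id, topic, score in rows:
        # Replace the stored entry only when this topic scores strictly higher.
        if doc_id not in best or score > best[doc_id][1]:
            best[doc_id] = (topic, score)
    return best


# Toy rows (doc_id, topic, score); illustrative values only.
rows = [
    (1, "urgent", 0.97),
    (1, "not urgent", 0.03),
    (2, "urgent", 0.32),
    (2, "not urgent", 0.68),
]

top = top_topic_per_doc(rows)
for doc_id, (topic, score) in sorted(top.items()):
    print(doc_id, topic, score)
```

With only two candidate topics, as in this demo, the selection reduces to picking the label with the larger score.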


In [48]:
%%sql
SELECT e.ENTITY_DOC_ID, e.ENTITY_CHAR_BEGIN, e.ENTITY_CHAR_END, e.ENTITY_TYPE, e.ENTITY, e.ENTITY_SCORE, d.TEXT
FROM {{schema}}.NAMED_ENTITY_VIEW as e JOIN {{schema}}.DOCUMENTS as d ON d.TEXT_DOC_ID = e.TEXT_DOC_ID

entity_doc_id,entity_char_begin,entity_char_end,entity_type,entity,entity_score,TEXT
4097,29,40,product_other,Sony Xperia,0.9134032726287842,I'm having an issue with the Sony Xperia. Please assist. Product ID: 0 3 Sold Unavailable I'm not sure if this issue is specific to my device or if others have reported similar problems.
4098,29,43,product_other,Soyn K4 HDR Tv,0.9644144773483276,"I'm having an issue with the Soyn K4 HDR Tv. Please assist. This is a message from the shop owner: Please help out with this product. If you will help out, I'm sure you can buy it out, I've recently updated the firmware of my Soyn K4 HDR Tv, and the issue started happening afterward. Could it be related to the update?"
4098,227,241,product_other,Soyn K4 HDR Tv,0.9642658233642578,"I'm having an issue with the Soyn K4 HDR Tv. Please assist. This is a message from the shop owner: Please help out with this product. If you will help out, I'm sure you can buy it out, I've recently updated the firmware of my Soyn K4 HDR Tv, and the issue started happening afterward. Could it be related to the update?"
4099,49,74,product_other,Microsoft Xbox Controller,0.7737916111946106,"I've accidentally deleted important data from my Microsoft Xbox Controller. Is there any way to recover the deleted files? I need them urgently. If I don't have them, then I won't get my credit card charged. Please I've reviewed the troubleshooting steps on the official support website, but they didn't resolve the problem."
4100,29,46,product_other,aGrmin Forerunmer,0.8641901016235352,"I'm having an issue with the aGrmin Forerunmer. Please assist. The problem is, it appears my keyboard needs some work. On Windows it needs to be running for at least 3x as many tasks, to actually see I've recently updated the firmware of my aGrmin Forerunmer, and the issue started happening afterward. Could it be related to the update?"
4100,124,131,product_software,Windows,0.6563641428947449,"I'm having an issue with the aGrmin Forerunmer. Please assist. The problem is, it appears my keyboard needs some work. On Windows it needs to be running for at least 3x as many tasks, to actually see I've recently updated the firmware of my aGrmin Forerunmer, and the issue started happening afterward. Could it be related to the update?"
4100,243,260,product_other,aGrmin Forerunmer,0.881909191608429,"I'm having an issue with the aGrmin Forerunmer. Please assist. The problem is, it appears my keyboard needs some work. On Windows it needs to be running for at least 3x as many tasks, to actually see I've recently updated the firmware of my aGrmin Forerunmer, and the issue started happening afterward. Could it be related to the update?"
4101,29,43,product_other,Sony 4K HDR Tv,0.688877284526825,"I'm having an issue with the Sony 4K HDR Tv. Please assist. I've checked for any available software updates for my Sony 4K HDR Tv, but there are none."
4101,115,129,product_other,Sony 4K HDR Tv,0.7083435654640198,"I'm having an issue with the Sony 4K HDR Tv. Please assist. I've checked for any available software updates for my Sony 4K HDR Tv, but there are none."
4102,4,15,product_other,LG Smart TV,0.9026075005531312,"The LG Smart TV is unable to establish a stable internet connection. It keeps disconnecting intermittently. How can I troubleshoot this network problem? First, download and install the Java Platform on your computer's machine (or anywhere I've noticed a sudden decrease in battery life on my LG Smart TV. It used to last much longer."
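
Judging by the rows above, `ENTITY_CHAR_BEGIN` and `ENTITY_CHAR_END` behave like zero-based character offsets into `TEXT`, with the end offset exclusive. The sketch below checks that reading against the two rows of document 4101; the helper name `entity_span` is hypothetical and only used for illustration.

```python
def entity_span(text: str, begin: int, end: int) -> str:
    """Return the substring addressed by zero-based, end-exclusive offsets."""
    return text[begin:end]


# TEXT of document 4101, copied from the query result above.
text = (
    "I'm having an issue with the Sony 4K HDR Tv. Please assist. "
    "I've checked for any available software updates for my Sony 4K HDR Tv, "
    "but there are none."
)

# (ENTITY_CHAR_BEGIN, ENTITY_CHAR_END, ENTITY) for the two rows of document 4101.
rows = [(29, 43, "Sony 4K HDR Tv"), (115, 129, "Sony 4K HDR Tv")]

for begin, end, entity in rows:
    # Prints True when the slice reproduces the ENTITY column.
    print(begin, end, entity_span(text, begin, end) == entity)
```

Because each mention carries its own offsets, a document that names the same product twice yields two rows, as documents 4098, 4100, and 4101 do.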


In [49]:
%%sql
SELECT
    k.KEYWORD_DOC_ID, k.KEYWORD_CHAR_BEGIN, k.KEYWORD_CHAR_END, 
    k.KEYWORD, k.KEYWORD_SCORE,
    d.TEXT
FROM {{schema}}.KEYWORD_SEARCH_VIEW as k
JOIN {{schema}}.DOCUMENTS as d
ON d.TEXT_DOC_ID = k.TEXT_DOC_ID
ORDER BY k.KEYWORD_DOC_ID, k.KEYWORD_SCORE DESC

keyword_doc_id,keyword_char_begin,keyword_char_end,keyword,keyword_score,TEXT
1,54,64,fi network,0.7114,"I'm having trouble connecting my iPhone to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The Wi-Fi The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
1,11,18,trouble,0.6919,"I'm having trouble connecting my iPhone to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The Wi-Fi The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
1,33,39,iphone,0.6882,"I'm having trouble connecting my iPhone to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The Wi-Fi The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
1,88,96,networks,0.6825,"I'm having trouble connecting my iPhone to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The Wi-Fi The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
1,260,271,other times,0.6698,"I'm having trouble connecting my iPhone to my home Wi-Fi network. It doesn't detect any networks, although other devices are connecting fine. What can be done to resolve this issue? The Wi-Fi The issue I'm facing is intermittent. Sometimes it works fine, but other times it acts up unexpectedly."
2,117,138,| order_total=3540.00,0.867,I'm having an issue with the Canon DSLR Camear. Please assist. Please try again later. | product_purchased=product | order_total=3540.00 The result is that I I need assistance as soon as possible because it's affecting my work and productivity.
2,234,246,productivity,0.7713,I'm having an issue with the Canon DSLR Camear. Please assist. Please try again later. | product_purchased=product | order_total=3540.00 The result is that I I need assistance as soon as possible because it's affecting my work and productivity.
2,29,34,canon,0.7544,I'm having an issue with the Canon DSLR Camear. Please assist. Please try again later. | product_purchased=product | order_total=3540.00 The result is that I I need assistance as soon as possible because it's affecting my work and productivity.
2,40,46,camear,0.735,I'm having an issue with the Canon DSLR Camear. Please assist. Please try again later. | product_purchased=product | order_total=3540.00 The result is that I I need assistance as soon as possible because it's affecting my work and productivity.
2,108,115,product,0.7173,I'm having an issue with the Canon DSLR Camear. Please assist. Please try again later. | product_purchased=product | order_total=3540.00 The result is that I I need assistance as soon as possible because it's affecting my work and productivity.
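
In the result above, the `KEYWORD` column appears to hold the lowercased form of the text between `KEYWORD_CHAR_BEGIN` and `KEYWORD_CHAR_END`. The sketch below checks that observation against three rows of document 1, using only the first sentence of its `TEXT`. The helper name `keyword_matches` is hypothetical, and the lowercasing rule is an inference from the output, not documented behaviour.

```python
def keyword_matches(text: str, begin: int, end: int, keyword: str) -> bool:
    """Check that the lowercased slice text[begin:end] equals the keyword."""
    return text[begin:end].lower() == keyword


# First sentence of TEXT for document 1, copied from the query result above.
text = "I'm having trouble connecting my iPhone to my home Wi-Fi network."

# (KEYWORD_CHAR_BEGIN, KEYWORD_CHAR_END, KEYWORD) taken from the rows of document 1.
rows = [(54, 64, "fi network"), (11, 18, "trouble"), (33, 39, "iphone")]

for begin, end, keyword in rows:
    # Show the original-case slice next to the stored keyword.
    print(repr(text[begin:end]), keyword, keyword_matches(text, begin, end, keyword))
```

The offsets make it possible to recover the original casing (for example `iPhone`) from `TEXT` when the lowercased keyword is not sufficient.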
