
# AI Mirror – NLP + Dialogflow (Prototype)

This is a **simple MVP prototype** for ZPDS.

What it does:
- the user pastes text (standing in for speech → text output)
- the user clicks the **Analyze** button
- the system:
  - analyzes the text (basic NLP)
  - detects emotion (stress / calm / neutral)
  - sends the text to **Dialogflow** (intent + reply)


## 1. Install libraries


In [1]:

!pip install google-cloud-dialogflow==2.26.0 ipywidgets nltk


Collecting protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5 (from google-cloud-dialogflow==2.26.0)
  Using cached protobuf-4.25.8-cp37-abi3-manylinux2014_x86_64.whl.metadata (541 bytes)
Using cached protobuf-4.25.8-cp37-abi3-manylinux2014_x86_64.whl (294 kB)
Installing collected packages: protobuf
  Attempting uninstall: protobuf
    Found existing installation: protobuf 5.29.5
    Uninstalling protobuf-5.29.5:
      Successfully uninstalled protobuf-5.29.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
nemo-toolkit 2.6.0 requires protobuf~=5.29.5, but you have protobuf 4.25.8 which is incompatible.
ydf 0.13.0 requires protobuf<7.0.0,>=5.29.1, but you have protobuf 4.25.8 which is incompatible.
opentelemetry-proto 1.37.0 requires protobuf<7.0,>=5.0, but you have protobuf 4.25.8 which is incompatible.


## 2. Imports and setup


In [2]:

import nltk
import re
from nltk.tokenize import sent_tokenize, word_tokenize
import ipywidgets as widgets
from IPython.display import display

nltk.download('punkt')


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

# AI Mirror – MVP (as shown in the presentation)

## Workflow

Screen 3 – Recording  
→ Speech-to-Text (transcription)

Screen 4 – Analysis (AI Mirror)  
→ NLP (pace, length, emotions)  
→ Dialogflow (intent of the utterance)

Screen 5 – AI Feedback  
→ describes how the user sounds (no judgments)

Screen 6 – Stress Insight  
→ a simple message about stress
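
The workflow lists pace and length among the NLP signals, while the analysis cell in this notebook only counts words and sentences. The block below is a minimal sketch of how pace could be estimated, assuming the recording duration (in seconds) is available from the Speech-to-Text step. The function name `speech_metrics` and the parameter `duration_seconds` are hypothetical and do not appear elsewhere in the notebook.

```python
import re


def speech_metrics(text, duration_seconds):
    """Return word count, sentence count, and pace in words per minute."""
    # Words: runs of word characters (optionally with one apostrophe),
    # so punctuation is not counted as a word.
    words = re.findall(r"\w+(?:'\w+)?", text)
    # Sentences: non-empty chunks between '.', '!' or '?' (a rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    minutes = duration_seconds / 60
    wpm = len(words) / minutes if minutes > 0 else 0.0
    return {
        "words": len(words),
        "sentences": len(sentences),
        "words_per_minute": round(wpm, 1),
    }


metrics = speech_metrics("I am a bit nervous. I will talk fast!", duration_seconds=4)
print(metrics)
```

Any threshold for "fast" or "slow" speech would have to be chosen separately; the sketch only reports the raw number.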



## 3. Simple NLP + Emotion logic


In [3]:

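# Keyword lists for the rule-based emotion check (English only)
stress_words = ['stress', 'nervous', 'afraid', 'panic', 'problem', 'hard']
calm_words = ['calm', 'ok', 'fine', 'good', 'safe', 'relaxed']

def analyze_text(text):
    """Return word count, sentence count and a keyword-based emotion label."""
    # word_tokenize also returns punctuation tokens, so 'words' includes them
    words = word_tokenize(text.lower())
    sentences = sent_tokenize(text)

    # Count how many tokens appear in each keyword list
    stress_score = sum(1 for w in words if w in stress_words)
    calm_score = sum(1 for w in words if w in calm_words)

    # The list with more hits wins; a tie (including 0:0) is NEUTRAL
    if stress_score > calm_score:
        emotion = 'STRESS'
    elif calm_score > stress_score:
        emotion = 'CALM'
    else:
        emotion = 'NEUTRAL'

    return {
        'words': len(words),
        'sentences': len(sentences),
        'emotion': emotion
    }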
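
To make the keyword rule easier to inspect, the sketch below returns the label together with the words that triggered it. It is an illustration, not part of the prototype: `emotion_evidence` is a hypothetical helper, and a plain regular expression stands in for `word_tokenize` so the example runs without downloading NLTK data.

```python
import re

STRESS_WORDS = {"stress", "nervous", "afraid", "panic", "problem", "hard"}
CALM_WORDS = {"calm", "ok", "fine", "good", "safe", "relaxed"}


def emotion_evidence(text):
    """Return (label, stress keywords found, calm keywords found)."""
    # Regex tokenization is an assumption of this sketch (ASCII letters only).
    tokens = re.findall(r"[a-z']+", text.lower())
    stress_hits = [t for t in tokens if t in STRESS_WORDS]
    calm_hits = [t for t in tokens if t in CALM_WORDS]
    if len(stress_hits) > len(calm_hits):
        label = "STRESS"
    elif len(calm_hits) > len(stress_hits):
        label = "CALM"
    else:
        label = "NEUTRAL"
    return label, stress_hits, calm_hits


print(emotion_evidence("I am nervous and afraid, but it will be fine."))
print(emotion_evidence("Everything is ok."))
print(emotion_evidence("Hard day, but I feel calm."))
```

Because the rule only counts keywords, a phrase such as "not nervous" still adds to the stress score.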



## 4. Dialogflow – how it works (simple)

We send text → Dialogflow  
Dialogflow returns:
- intent name
- text answer

Below is a **mock version** (safe for the demo): it takes the detected emotion and returns only the text answer.
The real API needs Google Cloud credentials.


In [4]:
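def dialogflow_mock(emotion):
    """Return a canned reply (in Polish) for the detected emotion."""
    if emotion == "STRESS":
        # "You sound tense. You speak fast and without pauses. That is normal before a presentation."
        return "Brzmisz na spiętego. Mówisz szybko i bez pauz. To normalne przed prezentacją."
    elif emotion == "CALM":
        # "You sound calm. Your pace is even and you are easy to understand."
        return "Brzmisz spokojnie. Tempo jest równe i łatwo cię zrozumieć."
    else:
        # "You sound neutral. You can add a short pause to make it clearer."
        return "Brzmisz neutralnie. Możesz dodać krótką pauzę, żeby było jaśniej."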
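
Section 4 says that Dialogflow returns both an intent name and a text answer, while the mock returns only text. A minimal sketch of a mock that mirrors both fields could look like the block below. The name `dialogflow_mock_response`, the intent names, and the short English replies are made up for illustration; a real agent defines its own intents and responses.

```python
# Hypothetical (intent, reply) pairs keyed by the emotion label.
MOCK_RESPONSES = {
    "STRESS": ("feedback.stress", "You sound tense."),
    "CALM": ("feedback.calm", "You sound calm."),
}
DEFAULT_RESPONSE = ("feedback.neutral", "You sound neutral.")


def dialogflow_mock_response(emotion):
    """Return a dict with the two fields the notebook expects from Dialogflow."""
    intent, reply = MOCK_RESPONSES.get(emotion, DEFAULT_RESPONSE)
    return {"intent": intent, "reply": reply}


print(dialogflow_mock_response("STRESS"))
```

Keeping the same two fields in the mock means the UI code would not need to change when the real API call is swapped in.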



## 5. UI – like prototype buttons


In [5]:

input_text = widgets.Textarea(
    placeholder='Paste your text here...',
    layout=widgets.Layout(width='100%', height='120px')
)

analyze_btn = widgets.Button(description='Analyze')
output = widgets.Output()

def on_click(b):
    output.clear_output()
    text = input_text.value

    result = analyze_text(text)
    dialogflow_reply = dialogflow_mock(result['emotion'])

    with output:
        print('Words:', result['words'])
        print('Sentences:', result['sentences'])
        print('Emotion:', result['emotion'])
        print('AI Mirror says:', dialogflow_reply)

analyze_btn.on_click(on_click)

display(input_text, analyze_btn, output)




## 6. How to connect the REAL Dialogflow later

1. Create an agent in Dialogflow (https://dialogflow.cloud.google.com/#/login, not dialogflow.com; the latter does not work)
2. Enable the Dialogflow API in Google Cloud
3. Create a service account key (JSON)
4. Replace `dialogflow_mock()` with a real API call
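
The steps above can be sketched as follows. This is an untested outline rather than the final integration. It assumes the `google-cloud-dialogflow` package from section 1 is installed, the `GOOGLE_APPLICATION_CREDENTIALS` environment variable points to the service account JSON key, and the agent is a Dialogflow ES agent in the Google Cloud project passed as `project_id`. The function names `detect_intent_text` and `get_reply`, and the default language code `"pl"`, are assumptions made for this example.

```python
import uuid


def detect_intent_text(project_id, text, language_code="pl", session_id=None):
    """Send one utterance to the agent; return (intent name, reply text)."""
    # Imported here so the rest of the notebook runs without the package.
    from google.cloud import dialogflow

    client = dialogflow.SessionsClient()
    # A session groups the turns of one conversation; a random id is
    # used when the caller does not supply one.
    session = client.session_path(project_id, session_id or uuid.uuid4().hex)

    text_input = dialogflow.TextInput(text=text, language_code=language_code)
    query_input = dialogflow.QueryInput(text=text_input)
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    result = response.query_result
    return result.intent.display_name, result.fulfillment_text


def get_reply(text, emotion, mock, project_id=None):
    """Use the real agent when project_id is set, otherwise the mock."""
    if project_id is None:
        # No project configured: no intent is available, only the mock reply.
        return None, mock(emotion)
    return detect_intent_text(project_id, text)


# Without a project id, the call falls back to whatever mock is passed in.
print(get_reply("I am nervous", "STRESS", mock=lambda e: f"mock reply for {e}"))
```

In the notebook, `mock` would be `dialogflow_mock`, and `project_id` would stay `None` until the agent and credentials exist.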


In [21]:
# @title Connect the Colab notebook to Google Drive
##### data in the runtime environment folder
# data_dir = "/content/sample_data/"

##### data on Google Drive
from google.colab import drive
drive.mount("/content/drive")
#@markdown Enter the path where files will be saved
data_dir = "/content/drive/My Drive/Colab Notebooks/2 сем 2 курс/ZPDS project/" #@param {type:"string"}

##### data in a local folder - when running the .py script in a local environment
# data_dir = "./"

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
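
The cell above switches between Colab and local runs by commenting lines in and out. A small helper could make that choice automatically. The sketch below assumes that a failed import of `google.colab` means the code is running outside Colab; `resolve_data_dir` is a hypothetical name, not something the notebook defines.

```python
import os


def resolve_data_dir(drive_dir, local_dir="./"):
    """Return drive_dir inside Colab (after mounting Drive), else local_dir."""
    try:
        from google.colab import drive  # importable only inside Colab
    except ImportError:
        return local_dir
    drive.mount("/content/drive")
    # Create the target folder on Drive if it does not exist yet.
    os.makedirs(drive_dir, exist_ok=True)
    return drive_dir


data_dir = resolve_data_dir("/content/drive/My Drive/Colab Notebooks/demo/")
print(data_dir)
```

The Drive path in the usage line is a placeholder; the real project folder from the cell above would be passed instead.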


### MODEL TTS GOOGLE

In [6]:
# @title Step 1: Install libraries
!apt-get install -y portaudio19-dev
!pip install pyaudio
!pip install scipy
!pip install gTTS

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libasound2-dev libjack-dev libjack0 libportaudio2 libportaudiocpp0
Suggested packages:
  libasound2-doc jackd1 portaudio19-doc
The following packages will be REMOVED:
  libjack-jackd2-0
The following NEW packages will be installed:
  libasound2-dev libjack-dev libjack0 libportaudio2 libportaudiocpp0
  portaudio19-dev
0 upgraded, 6 newly installed, 1 to remove and 1 not upgraded.
Need to get 596 kB of archives.
After this operation, 3,178 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libjack0 amd64 1:0.125.0-3build2 [93.3 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/main amd64 libasound2-dev amd64 1.2.6.1-1ubuntu1 [110 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy/universe amd64 libjack-dev amd64 1:0.125.0-3build2 [206 kB]
Get:4 http://archive.ubuntu.com/ubuntu jammy/universe

In [22]:
# @title Step 2: Prepare the data for synthesis
#@markdown Enter the sentence to synthesize
text = "siemka, no jak tam?" #@param {type:"string"}
#@markdown Enter the name under which the file should be saved
synthesis_filename = "google_tts" #@param {type:"string"}

In [23]:
# @title Step 3: Synthesize speech and save the result (model loading and speech synthesis are performed together by the gTTS function from the gtts package)
import os
from gtts import gTTS
from IPython.display import Audio

# gTTS produces MP3 data, so the file gets an .mp3 extension rather than .wav
save_path_2 = os.path.join(data_dir, f'{synthesis_filename}.mp3')

tts = gTTS(text, lang='pl')
tts.save(save_path_2)

Audio(save_path_2, autoplay=True)

### MODEL MMS TTS POL

In [24]:
# @title Step 1: Install libraries
!pip install --upgrade transformers accelerate

Collecting transformers
  Downloading transformers-4.57.3-py3-none-any.whl.metadata (43 kB)
Collecting tokenizers<=0.23.0,>=0.22.0 (from transformers)
  Downloading tokenizers-0.22.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.8 kB)
Downloading transformers-4.57.3-py3-none-any.whl (12.0 MB)
Downloading tokenizers-0.22.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
Installing collected packages: tokenizers, transformers
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.21.4
    Uninstalling tokenizers-0.21.4:
      Successfully uninstalled tokenizers

In [25]:
# @title Step 2: Prepare the data for synthesis

#@markdown Enter the sentence to synthesize
text = "Co słychać?" #@param {type:"string"}
#@markdown Enter the name under which the file should be saved
synthesis_filename = "meta_tts" #@param {type:"string"}

In [26]:
# @title Step 3: Load the MMS TTS POL model

from transformers import VitsModel, AutoTokenizer
import torch

model = VitsModel.from_pretrained("facebook/mms-tts-pol")
tokenizer = AutoTokenizer.from_pretrained("facebook/mms-tts-pol")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [28]:
# @title Step 4: Synthesize speech and save the result
from scipy.io import wavfile
from IPython.display import Audio

save_path_2 = f'{data_dir}/{synthesis_filename}.wav'

inputs = tokenizer(text, return_tensors="pt")

# Named `waveform` so it does not overwrite the `output` widget from section 5
with torch.no_grad():
    waveform = model(**inputs).waveform

# Convert the tensor to a NumPy array
output_np = waveform.squeeze().cpu().numpy()

# Save the audio
wavfile.write(save_path_2, rate=model.config.sampling_rate, data=output_np)

Audio(output_np, rate=model.config.sampling_rate)
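
The model returns a floating-point waveform, and writing a float array with `scipy.io.wavfile.write` produces a floating-point WAV file, which some players may not open. If that turns out to be a problem, the samples can be stored as 16-bit PCM instead. The block below is a minimal sketch using only the standard library; `write_pcm16_wav` is a hypothetical helper, and it assumes mono samples in the range [-1.0, 1.0].

```python
import array
import sys
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    pcm = array.array("h")  # signed 16-bit integers
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # clip values outside the range
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())
```

In the cell above, the call could look like `write_pcm16_wav(save_path_2, output_np.tolist(), model.config.sampling_rate)`.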

### MODEL XTTS

In [29]:
# @title Step 1: Install libraries
!pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
!pip install TTS
!pip install --upgrade TTS

Looking in indexes: https://download.pytorch.org/whl/cu118
ERROR: Could not find a version that satisfies the requirement torch==2.1.0 (from versions: 2.2.0+cu118, 2.2.1+cu118, 2.2.2+cu118, 2.3.0+cu118, 2.3.1+cu118, 2.4.0+cu118, 2.4.1+cu118, 2.5.0+cu118, 2.5.1+cu118, 2.6.0+cu118, 2.7.0+cu118, 2.7.1+cu118)
ERROR: No matching distribution found for torch==2.1.0
ERROR: Ignored the following versions that require a different python version: 0.0.10.2 Requires-Python >=3.6.0, <3.9; 0.0.10.3 Requires-Python >=3.6.0, <3.9; 0.0.11 Requires-Python >=3.6.0, <3.9; 0.0.12 Requires-Python >=3.6.0, <3.9; 0.0.13.1 Requires-Python >=3.6.0, <3.9; 0.0.13.2 Requires-Python >=3.6.0, <3.9; 0.0.14.1 Requires-Python >=3.6.0, <3.9; 0.0.15 Requires-Python >=3.6.0, <3.9; 0.0.15.1 Requires-Python >=3.6.0, <3.9; 0.0.9 Requires-Python >=3.6.0, <3.9; 0.0.9.1 Requires-Python >=3.6.0, <3.9; 0.0.9.2 Requires-Python >=3.6.0, <3.9; 0.0.9a10 Requires-Python >=3.6.0, <3.9; 0.0.9a9 R

In [34]:

# @title Google TTS model - speech synthesis with a voice known to the model
import os
from gtts import gTTS
from IPython.display import Audio

#@markdown Enter the sentence to synthesize
text = "Hahaha test 1, 2, 3" #@param {type:"string"}
#@markdown Enter the name under which the file should be saved
synthesis_filename = "test123" #@param {type:"string"}
#@markdown Choose the synthesis language (Polish or English)
lang = 'pl' #@param ["en", "pl"]

# gTTS produces MP3 data, so the file gets an .mp3 extension rather than .wav
save_path = os.path.join(data_dir, f'{synthesis_filename}.mp3')

tts = gTTS(text, lang=lang)
tts.save(save_path)

Audio(save_path, autoplay=True)