<a href="https://colab.research.google.com/github/rainbowcores/vulavula-test/blob/main/Copy_of_Vulavula_API_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div>
<img src="https://drive.google.com/uc?id=1XhgzBxVSyQpP_oK_5biNDWDDgUicK0x_" width="400" class="center">
</div>


# WELCOME TO THE VULAVULA DEMO NOTEBOOK!

We are so glad you are here! This notebook will help you get started with the Vulavula API. It walks through the Vulavula features in turn: Transcribe, Analyse, Converse, Translate, and Search. Please note that you first need to create a `<VULAVULA TOKEN>` on the Vulavula Platform at this link: https://vulavula.lelapa.ai/

Once you have your token, replace the `<VULAVULA TOKEN>` text below and you will be able to continue with the rest of the cells.

**Note: Currently supports Python 3.10+**

In [None]:
# Installations
!pip install sounddevice wavio
!pip install ipywebrtc notebook
!apt install ffmpeg
!apt-get install libportaudio2

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libportaudio2 is already the newest version (19.6.0-1.1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.


In [None]:
# Get your VULAVULA_TOKEN by logging in and getting keys
VULAVULA_TOKEN = "<VULAVULA TOKEN>"

To get started, you first need to install the `vulavula` SDK.

After the installation, import the `VulavulaClient` from the vulavula module and initialize it with your API token.

In [None]:
!pip install -U vulavula  -q

from vulavula import VulavulaClient
from vulavula.common.error_handler import VulavulaError

client = VulavulaClient(VULAVULA_TOKEN)

# Vulavula Transcribe

We haven't yet integrated our new synchronous endpoint into the `VulavulaClient` SDK. Until then, please accept our apologies in the form of an example using Python `requests`.


## Record your 10s sound snippet to be transcribed

🛠 **Feature**: Vulavula Transcribe

👅 **Languages**: South African English, isiZulu, Sesotho, Afrikaans


**NOTE**: You can specify which language you want to transcribe later on in the notebook.


### Let's import and setup some important things:

In [None]:
import requests
import os
import torch
import pandas as pd
import torchaudio
import base64
from ipywebrtc import AudioRecorder, CameraStream
from IPython.display import Audio, display
import ipywidgets as widgets

transcribe_url = "https://vulavula-services.lelapa.ai//api/v1/transcribe/sync"

### Let's record a sound file!

In [None]:
from google.colab import output
output.enable_custom_widget_manager()

In [None]:
camera = CameraStream(constraints={'audio': True,'video':False})
recorder = AudioRecorder(stream=camera)
recorder

AudioRecorder(audio=Audio(value=b'', format='webm'), stream=CameraStream(constraints={'audio': True, 'video': …

In [None]:
with open('recording.webm', 'wb') as f:
    f.write(recorder.audio.value)
!ffmpeg -i recording.webm -ac 1 -f wav my_recording.wav -y -hide_banner -loglevel panic
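
Before sending the audio, it can help to confirm that the conversion produced a usable file. The following is a minimal sketch using Python's standard `wave` module; the helper name `wav_duration_seconds` is ours, not part of the Vulavula SDK, and it assumes the file is plain PCM WAV.

```python
import os
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


# Only inspect the recording if the conversion step has already produced it.
if os.path.exists("my_recording.wav"):
    duration = wav_duration_seconds("my_recording.wav")
    print(f"my_recording.wav is {duration:.1f} seconds long")
    if duration == 0:
        print("The recording is empty. Try recording again.")
```

A duration of zero usually means the recorder widget had not captured any audio yet when the file was written.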

Woop woop. **YOU ARE NOW READY TO GET YOUR TRANSCRIPTION!!**

This is example code for our synchronous transcribe endpoint. An asynchronous option, delivered via webhooks, is described in our documentation.

This is where you can specify a language code to select the language you want to transcribe. The following language codes are valid:

* AFRIKAANS = "afr"
* ISIZULU = "zul"
* SESOTHO = "sot"
* RSA_ENGLISH = "eng"

If no language code is specified, our built-in language ID will select the most probable language.
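
As a rough sketch, the request body could be assembled by a small helper that validates the language code against the list above. The helper name `build_transcribe_payload` is hypothetical, and the `language_code` field name is an assumption (it mirrors the key that appears in the endpoint's response); please check the Vulavula documentation for the exact request field.

```python
import base64

# Language codes listed in this notebook.
VALID_LANGUAGE_CODES = {"afr", "zul", "sot", "eng"}


def build_transcribe_payload(wav_bytes, file_name="my_recording.wav", language_code=None):
    """Build the JSON body for the synchronous transcribe endpoint."""
    payload = {
        "file_name": file_name,
        "audio_blob": base64.b64encode(wav_bytes).decode("utf-8"),
        "file_size": 0,  # no longer used, but still required
    }
    if language_code is not None:
        if language_code not in VALID_LANGUAGE_CODES:
            raise ValueError(
                f"Unknown language code {language_code!r}; "
                f"expected one of {sorted(VALID_LANGUAGE_CODES)}"
            )
        # ASSUMPTION: the request field is called "language_code".
        payload["language_code"] = language_code
    return payload


# Example with placeholder bytes instead of a real recording.
example_payload = build_transcribe_payload(b"not real audio", language_code="zul")
print(sorted(example_payload.keys()))
```

Leaving `language_code` as `None` omits the field, so the built-in language ID makes the choice.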

In [None]:
try:
    # Read the WAV file recorded above and base64-encode it for the JSON body
    with open('my_recording.wav', 'rb') as file:
        encoded_wav = base64.b64encode(file.read()).decode('utf-8')
    payload = {
        "file_name": "my_recording.wav",
        "audio_blob": encoded_wav,
        "file_size": 0,  # this parameter is no longer used, but is still not optional! Sorry!
    }
    headers = {
        'Content-Type': 'application/json',
        'X-CLIENT-TOKEN': VULAVULA_TOKEN,
    }
    response = requests.post(
        transcribe_url,
        json=payload,
        headers=headers,
    )
except Exception as e:
    print("An unexpected error occurred:", str(e))


In [None]:
# Get the status code
print(f'Status Code: {response.status_code}')

# If the response is in JSON format, you can get it as a dictionary:
try:
    response_json = response.json()  # Converts response to JSON format
    print(f'Response JSON: {response_json}')
except ValueError:
    print("Response is not in JSON format")

Status Code: 200
Response JSON: {'transcription_text': None, 'language_code': None, 'diarisation_result': None, 'audio_length_s': None, 'transcription_status': 'FAILED', 'error_message': '500 Server Error: Internal Server Error for url: http://vv-shovel/transcribe/from_azure'}
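
Note that the sample output above pairs an HTTP 200 with `'transcription_status': 'FAILED'`, so the status code alone does not tell you whether the transcription worked. A minimal sketch of a check on the response body follows; the helper name `extract_transcription` is ours, and it only assumes the keys visible in the output above.

```python
def extract_transcription(response_json):
    """Return the transcription text, or raise if the service reported a failure."""
    if response_json.get("transcription_status") == "FAILED":
        raise RuntimeError(
            f"Transcription failed: {response_json.get('error_message')}"
        )
    return response_json.get("transcription_text")


# The failed response shown above, reused as sample data.
failed_response = {
    "transcription_text": None,
    "language_code": None,
    "diarisation_result": None,
    "audio_length_s": None,
    "transcription_status": "FAILED",
    "error_message": "500 Server Error: Internal Server Error for url: http://vv-shovel/transcribe/from_azure",
}

try:
    print(extract_transcription(failed_response))
except RuntimeError as err:
    print(err)
```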


🚨 **DID YOU GET A 503 or 504 response???**

**NOTE**: Our models sometimes go to sleep (because running all day is tiring and expensive). We're busy streamlining our morning routine and optimizing our energy. Until we get that right, you might occasionally receive a 503 HTTP error when using our APIs.

We ask you to retry the call until the model has woken up.
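
One way to retry is a small wrapper with exponential backoff. This is only a sketch: the helper name `call_with_retry`, the number of attempts, and the delays are our own choices, not Vulavula recommendations. It works with any callable that returns an object with a `status_code` attribute, such as a `requests` response.

```python
import time


def call_with_retry(make_request, max_attempts=5, base_delay_s=2.0,
                    retry_statuses=(503, 504), sleep=time.sleep):
    """Call make_request() until the status code is not in retry_statuses.

    Waits base_delay_s, then 2x, 4x, ... between attempts. Returns the last
    response, even if it still has a retryable status.
    """
    response = None
    for attempt in range(max_attempts):
        response = make_request()
        if response.status_code not in retry_statuses:
            return response
        if attempt < max_attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    return response


# Usage with the transcribe request (body is your request dictionary):
# response = call_with_retry(
#     lambda: requests.post(transcribe_url, json=body, headers=headers)
# )
```

The SDK client reports these errors by raising `VulavulaError` rather than returning a response, so for SDK calls you would catch the exception and retry instead.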

# Vulavula Analyse

## Sentiment Analysis: isiZulu text sentiment

🛠 **Feature**: Vulavula Analyse

👅 **Languages**: isiZulu

Sentiment is simpler, with barely any setup required. Just enter your isiZulu statement below and run the cell.

In [None]:
sentence = "Ngikwadile!" # I am happy

Now run the API request to get the sentiment of your sentence.

In [None]:
try:
    data = {
        "encoded_text": sentence
    }
    sentiment_result = client.get_sentiments(data)
    print(sentiment_result)
except VulavulaError as e:
    print("An error occurred:", e.message)
    if 'details' in e.error_data:
        print("Error Details:", e.error_data['details'])
    else:
        print("No additional error details are available.")
except Exception as e:
    print("An unexpected error occurred:", str(e))


An unexpected error occurred: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))




---



## Entity Recognition: Extract Multilingual Entities

🛠 **Feature**: Vulavula Analyse

👅 **Languages**: isiZulu (but it is actually multilingual)

Entity recognition is simple too. Just provide a sentence below to extract entities from:

In [None]:
entity_sentence = "President Ramaphosa gaan loop by Emfuleni Municipality"

Now run the API request to extract entities from your sentence.

In [None]:
try:
    entity_result = client.get_entities({"encoded_text": entity_sentence})
    print("Entity Recognition Output:", entity_result)
except VulavulaError as e:
    print("An error occurred:", e.message)
    if 'details' in e.error_data:
        print("Error Details:", e.error_data['details'])
    else:
        print("No additional error details are available.")
except Exception as e:
    print("An unexpected error occurred:", str(e))

Entity Recognition Output: [{'entity': 'person', 'word': 'Ramaphosa', 'start': 10, 'end': 19}, {'entity': 'location', 'word': 'Emfuleni Municipality', 'start': 33, 'end': 54}]
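
The `start` and `end` values in the output above appear to be character offsets into the input sentence, with `end` exclusive. That reading is an inference from this one sample, not documented behaviour. A short sketch that slices the entities back out of the sentence (the helper name `entity_spans` is ours):

```python
sample_sentence = "President Ramaphosa gaan loop by Emfuleni Municipality"

# The entity list printed above, reused as sample data.
sample_entities = [
    {"entity": "person", "word": "Ramaphosa", "start": 10, "end": 19},
    {"entity": "location", "word": "Emfuleni Municipality", "start": 33, "end": 54},
]


def entity_spans(sentence, entities):
    """Return (entity type, text) pairs by slicing the sentence at each offset."""
    return [(e["entity"], sentence[e["start"]:e["end"]]) for e in entities]


for entity_type, text in entity_spans(sample_sentence, sample_entities):
    print(f"{entity_type}: {text}")
```

Slicing by offset rather than searching for `word` keeps the result correct when the same word appears more than once in a sentence.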


# Vulavula Converse

## Few-shot learning for intent

🛠 Feature: Vulavula Converse

👅 Languages: isiZulu

NLU: Converse supports Natural Language Understanding (NLU) through Intent Classification and Named Entity Recognition (from the Analyse functionality above).

In the cell below, provide a few examples for each intent you want to identify, along with the actual messages you have received in your conversation flow.

In [None]:
# Request body
data = {
    # A few labelled examples for each intent you want to be able to recognise
    "examples": [
        {"intent": "greeting", "example": "Habari yako"},  # how are you
        {"intent": "greeting", "example": "Hujambo"},  # hi
        {"intent": "goodbye", "example": "Kwaheri"},  # bye
        {"intent": "goodbye", "example": "Tuonane badae"}
    ],
    # The actual messages received from your messaging platform that you want to classify into an intent
    "inputs": [
        "Uko mzima?",
        "Naondoka naenda zangu"
    ]
}

Now that you have created your request body, you can classify the intents by sending it in a simple POST request.

In [None]:
try:
    classification_results = client.classify(data)
    print("Classification Results:", classification_results)
except VulavulaError as e:
    print("An error occurred:", e.message)
    if 'details' in e.error_data:
        print("Error Details:", e.error_data['details'])
    else:
        print("No additional error details are available.")
except Exception as e:
    print("An unexpected error occurred:", str(e))

Classification Results: [{'probabilities': [{'intent': 'goodbye', 'score': 0.23908754}, {'intent': 'greeting', 'score': 0.7609125}]}, {'probabilities': [{'intent': 'goodbye', 'score': 0.3370287}, {'intent': 'greeting', 'score': 0.66297126}]}]
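
Each entry in the result lists a score for every intent, in the same order as the `inputs`. A small sketch for picking the highest-scoring intent per input, using the output above as sample data (the helper name `top_intents` is ours):

```python
# The classification output printed above, reused as sample data.
sample_results = [
    {"probabilities": [{"intent": "goodbye", "score": 0.23908754},
                       {"intent": "greeting", "score": 0.7609125}]},
    {"probabilities": [{"intent": "goodbye", "score": 0.3370287},
                       {"intent": "greeting", "score": 0.66297126}]},
]


def top_intents(results):
    """Return the (intent, score) pair with the highest score for each input."""
    best = []
    for result in results:
        winner = max(result["probabilities"], key=lambda p: p["score"])
        best.append((winner["intent"], winner["score"]))
    return best


for intent, score in top_intents(sample_results):
    print(f"{intent} ({score:.2f})")
```

In a real conversation flow you may want to treat low scores as "unsure" rather than acting on them; the right threshold depends on your data.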


# Vulavula Translate

### Multilingual Machine Translation

🛠 Feature: Vulavula Translate

👅 Languages: Northern Sotho (nso_Latn), Afrikaans (afr_Latn), Southern Sotho (sot_Latn), Swati (ssw_Latn), Tsonga (tso_Latn), Tswana (tsn_Latn), Xhosa (xho_Latn), Zulu (zul_Latn), English (eng_Latn), Swahili (swh_Latn)



The translation functionality supports a variety of languages, making it a powerful tool for multilingual communication. All you need to do is provide the input text (`input_text`), the source language (`source_lang`), and the target language (`target_lang`).

In [None]:

translation_data = {
  "input_text": "Lo musho ubhalwe ngesiZulu.",
  "source_lang": "zul_Latn",
  "target_lang": "eng_Latn"
}

translation_result = client.translate(translation_data)
print("Translation Result:", translation_result)

VulavulaError: Vulavula errors - Details: {"error": true, "status_code": 504, "message": "API error: 504", "details": "<HTML body of the Cloudflare '504: Gateway time-out' error page for vulavula-services.lelapa.ai, omitted>"}

# Vulavula Search

### Multilingual Embeddings Search

🛠 Feature: Vulavula Search

👅 Languages: All languages


Embedding search is a powerful technique that converts text into numerical vectors, capturing semantic meaning for efficient retrieval. In multilingual communication, embedding models trained on diverse languages can map the meaning of text into a shared space, enabling cross-language searches. For instance, in a customer support system, a query in one language can retrieve relevant answers stored in different languages, so users can reach the same information whatever language they ask in. This makes embedding search an invaluable tool for applications like FAQs, where users can find accurate and contextually relevant information regardless of language barriers.
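
To make the idea concrete, here is a toy illustration of ranking documents by cosine similarity. The three-dimensional vectors are made up by hand for this example; real embeddings come from a model and have many more dimensions, and this sketch does not describe how Vulavula Search is implemented internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors: the two password documents point in a similar direction
# even though they are (hypothetically) written in different languages.
toy_documents = {
    "password reset (English)": [0.9, 0.1, 0.0],
    "password reset (isiZulu)": [0.8, 0.2, 0.1],
    "opening hours (English)": [0.1, 0.9, 0.2],
}

# Hand-made vector standing in for an embedded query about passwords.
toy_query = [0.85, 0.15, 0.05]

ranked = sorted(
    toy_documents,
    key=lambda name: cosine_similarity(toy_query, toy_documents[name]),
    reverse=True,
)
for name in ranked:
    print(f"{cosine_similarity(toy_query, toy_documents[name]):.3f}  {name}")
```

Both password documents rank far above the opening-hours document, which is the behaviour that lets a query in one language find answers written in another.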

To get started, create a knowledge base:

In [None]:
knowledge_base_result = client.create_knowledgebase("<knowledge base name>")
print("Knowledge Base Creation Result:", knowledge_base_result)

Retrieve all existing knowledge bases.

In [None]:
knowledgebases = client.get_knowledgebases()
print("Knowledge Bases:", knowledgebases)

You can also delete a knowledge base.

In [None]:
delete_result = client.delete_knowledgebase("<knowledgebase id>")
print("Delete Result:", delete_result)

{'error': 'Vulavula errors - Details: {"error": true, "status_code": 404, "message": "API error: 404", "details": "{\\"detail\\":\\"Not Found\\"}"}'}

Upload a file and extract its text to create documents within a knowledge base.


In [None]:
result = client.create_documents("<document name>.pdf", "<knowledgebase id>")
print("Upload and Extract Result:", result)

Retrieve all documents within a specified knowledge base.

In [None]:
result = client.get_documents("<knowledgebase id>")
print(result)

Delete a specific document from a knowledge base.

In [None]:
delete_result = client.delete_document("<document id>")
print("Delete Result:", delete_result)

Perform a search query within a specific knowledge base and language.

In [None]:
query = "<INSERT QUESTION TO QUERY YOUR DOCUMENT>"

In [None]:
result = client.query(knowledgebase_id="<knowledgebase id>", query=query, language="en_Us")
print(result)


## Congratulations 🎉

You have engaged with all the Vulavula features! Please do not hesitate to contact us at vulavulasupport@lelapa.ai if you have any questions! Happy building!