# OCI Language

Small language models can be a cheaper and better fit for various language tasks, such as:

1. sentiment analysis
1. key phrase extraction
1. named entity extraction 
1. text classification
1. PII identification & masking 
1. language detection
1. translation

See the SDK source at: https://github.com/oracle/oci-python-sdk/blob/22fd62c8dbbd1aaed6b75754ec1ba8a3c16a4e5a/src/oci/ai_language/ai_service_language_client.py#L584

Documentation at: https://docs.oracle.com/en-us/iaas/language/using/home.htm

## Sentiment Analysis

Sentiment can be reported at two levels:

1. Aspects: topics or concepts within the text, each with its own sentiment
1. Sentences: sentiment for each entire sentence

https://docs.oracle.com/en-us/iaas/language/using/sentment.htm

In [5]:
import oci, os, json 


#####
# Make sure your sandbox.json file is set up for your environment. You might have to specify the full path depending on your `cwd`.
#####
SANDBOX_CONFIG_FILE = "sandbox.json"

scfg = None
# read the sandbox config
with open(os.path.expanduser(SANDBOX_CONFIG_FILE), 'r') as f:
    scfg = json.load(f)

# read the oci config
config = oci.config.from_file(os.path.expanduser(scfg["oci"]["configFile"]), scfg["oci"]["profile"])

compartmentId = scfg["oci"]["compartment"]


lang_client = oci.ai_language.AIServiceLanguageClient(config)


test_string1 = """
    Oracle Cloud Infrastructure is built for enterprises seeking higher performance, lower costs, and easier cloud migration for their applications. 
    Customers choose Oracle Cloud Infrastructure over AWS for several reasons:
    First, they can consume cloud services in the public cloud or within their  own data center with Oracle Dedicated Region Cloud@Customer. 
    Second, they can migrate and run any workload as is on Oracle Cloud, including Oracle databases and applications, VMware, or bare metal servers. 
    Third, customers can easily implement security controls and automation to prevent misconfiguration errors and implement security best practices. 
    Fourth, they have lower risks with Oracle’s end-to-end SLAs covering performance, availability, and manageability of services. 
    Finally, their workloads achieve better performance at a significantly lower cost with Oracle Cloud Infrastructure than AWS.
    
    Take a look at what makes Oracle Cloud Infrastructure a better cloud platform than AWS."
"""
test_string2 = "The restaurant Chinese Garden on 100 Broadway, Denver, CO-80503 serves delicious meals, but the food can be expensive."
test_string3 = "The wet, slushy rain in Denver can lead to accidents, but if you send an email to help@denver.org they will come out and help, which is awesome"

test_doc1 = oci.ai_language.models.TextDocument(
        key="oci",
        text=test_string1,
        language_code="en"
        )

test_doc2 = oci.ai_language.models.TextDocument(
        key="chinese_garden",
        text=test_string2,
        language_code="en"
        )
test_doc3 = oci.ai_language.models.TextDocument(
        key="Denver",
        text=test_string3,
        language_code="en"
        )

test_docs=[test_doc1, test_doc2, test_doc3]
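
The setup cell reads three values from `sandbox.json`: `oci.configFile`, `oci.profile` and `oci.compartment`. A minimal sketch of a loader that checks for those keys before any client is created might look like the following. The helper name `load_sandbox_config` is hypothetical, and the file may hold other sections that this sketch ignores.

```python
import json
import os

# Keys under the "oci" section that this notebook reads.
REQUIRED_OCI_KEYS = ("configFile", "profile", "compartment")


def load_sandbox_config(path):
    """Load the sandbox config and verify the keys used in this notebook.

    Hypothetical helper: it only checks that the keys exist, not that
    their values point at a valid OCI config file or compartment.
    """
    with open(os.path.expanduser(path), "r") as f:
        cfg = json.load(f)

    oci_section = cfg.get("oci")
    if not isinstance(oci_section, dict):
        raise KeyError("sandbox config has no 'oci' section")

    missing = [key for key in REQUIRED_OCI_KEYS if key not in oci_section]
    if missing:
        raise KeyError(f"sandbox config is missing oci keys: {missing}")
    return cfg
```

Calling `load_sandbox_config(SANDBOX_CONFIG_FILE)` in place of the bare `json.load` would surface a misconfigured file with a clearer message than a later `KeyError` on `scfg["oci"]`.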

In [None]:
# Sentiment analysis - aspect level

senti_details = oci.ai_language.models.BatchDetectLanguageSentimentsDetails(
    documents=test_docs,
    compartment_id=compartmentId)

senti_res = lang_client.batch_detect_language_sentiments(
    batch_detect_language_sentiments_details=senti_details, level=["ASPECT"])

print(senti_res.data)

In [None]:
# Sentiment analysis - sentence level

senti_details = oci.ai_language.models.BatchDetectLanguageSentimentsDetails(
    documents=test_docs,
    compartment_id=compartmentId)

senti_res = lang_client.batch_detect_language_sentiments(
    batch_detect_language_sentiments_details=senti_details, level=["SENTENCE"])

print(senti_res.data)
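
Printing the raw response is hard to scan once there are several documents. The sketch below flattens a sentiment result into `(document key, text, sentiment)` rows. It assumes each document result exposes `key` plus an `aspects` or `sentences` list whose items have `text` and `sentiment` attributes; check these names against the SDK models before relying on them. The helper name `summarize_sentiments` is hypothetical.

```python
def summarize_sentiments(result, level):
    """Flatten a sentiment result into (key, text, sentiment) rows.

    `level` is the attribute to read on each document: "aspects" or
    "sentences". The attribute names are assumptions about the response
    shape; the function only relies on duck typing.
    """
    rows = []
    for doc in result.documents:
        # A level that was not requested may be missing or None.
        for item in getattr(doc, level, None) or []:
            rows.append((doc.key, item.text, item.sentiment))
    return rows
```

With the cell above, `summarize_sentiments(senti_res.data, "sentences")` would give one row per sentence, and `"aspects"` would apply to the aspect-level call.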

## Key Phrase Extraction

In [None]:
keyphrase_extraction = lang_client.batch_detect_language_key_phrases(
    batch_detect_language_key_phrases_details=oci.ai_language.models.BatchDetectLanguageKeyPhrasesDetails(
        documents=test_docs, compartment_id=compartmentId
    )
)

print(keyphrase_extraction.data)

## Named Entity Extraction

see: https://docs.oracle.com/en-us/iaas/language/using/ner.htm

In [None]:
ner_extraction = lang_client.batch_detect_language_entities(
    batch_detect_language_entities_details=oci.ai_language.models.BatchDetectLanguageEntitiesDetails(
        documents=test_docs, compartment_id=compartmentId
    )
)

print(ner_extraction.data)

## Text Classification

see: https://docs.oracle.com/en-us/iaas/language/using/text-class.htm


In [None]:
# Run text classification on the test documents
text_classification = lang_client.batch_detect_language_text_classification(
    batch_detect_language_text_classification_details=oci.ai_language.models.BatchDetectLanguageTextClassificationDetails(
        documents=test_docs, compartment_id=compartmentId
    )
)
print(text_classification.data)

## PII Identification

see: https://docs.oracle.com/en-us/iaas/language/using/pii.htm

In [None]:


# Mask every detected PII entity with "*", leaving the last 4 characters visible
piiEntityMasking = oci.ai_language.models.PiiEntityMask(
    mode="MASK", masking_character="*", leave_characters_unmasked=4, is_unmasked_from_end=True
)
masking = {"ALL": piiEntityMasking}
pii_identification = lang_client.batch_detect_language_pii_entities(
    batch_detect_language_pii_entities_details=oci.ai_language.models.BatchDetectLanguagePiiEntitiesDetails(
        documents=test_docs, compartment_id=compartmentId,
        masking=masking
    )
)
print(pii_identification.data)
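
To make the masking settings concrete, here is a small local illustration of what `masking_character`, `leave_characters_unmasked` and `is_unmasked_from_end` are understood to mean. It is not the service's implementation, only a sketch under that reading of the parameters, and `mask_value` is a hypothetical helper.

```python
def mask_value(value, masking_character="*", leave_characters_unmasked=0,
               is_unmasked_from_end=False):
    """Locally mimic the assumed effect of the PII masking settings.

    Keeps `leave_characters_unmasked` characters visible, taken from the
    end of the value when `is_unmasked_from_end` is True and from the
    start otherwise. Everything else becomes `masking_character`.
    """
    keep = max(0, min(leave_characters_unmasked, len(value)))
    masked_len = len(value) - keep
    if is_unmasked_from_end:
        return masking_character * masked_len + value[masked_len:]
    return value[:keep] + masking_character * masked_len


# Same settings as the PiiEntityMask above, applied to the email in test_string3.
print(mask_value("help@denver.org", "*", 4, True))
```

Comparing this local output with the masked text in `pii_identification.data` is a quick way to confirm that the service interprets the settings as expected.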

## Language Detection

see: https://docs.oracle.com/en-us/iaas/language/using/lang-detect.htm


In [None]:
# AI Service : Language detection


lang_doc1 = oci.ai_language.models.DominantLanguageDocument(
        key="french",
        text="Et encore une autre langue, es-possible qu'il le comprend ?",
        )

lang_doc2 = oci.ai_language.models.DominantLanguageDocument(
        key="dutch",
        text="Een tekst in mijn moedertaal om het een beetje moeilijker te maken voor de service",
        )
lang_doc3 = oci.ai_language.models.DominantLanguageDocument(
        key="english",
        text="This should be fairly easy to detect, I'll avoid using the name of the actual language in this text",
        )

lang_docs=[lang_doc1, lang_doc2, lang_doc3]

response = lang_client.batch_detect_dominant_language(
    batch_detect_dominant_language_details=oci.ai_language.models.BatchDetectDominantLanguageDetails(
        documents=lang_docs, compartment_id=compartmentId
    )
)

print(response.data)
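
Since each document is keyed by the language it was written in, it is convenient to compare the key with the top detected language. The sketch below assumes each document result has a `key` and a `languages` list whose items carry `code` and `score`; these attribute names are assumptions about the response model, and `top_language_by_key` is a hypothetical helper.

```python
def top_language_by_key(result):
    """Map each document key to its highest-scoring (code, score) pair.

    Relies only on duck typing; the attribute names are assumptions
    about the detection response. Documents with no detected language
    are skipped.
    """
    top = {}
    for doc in result.documents:
        if doc.languages:
            best = max(doc.languages, key=lambda lang: lang.score)
            top[doc.key] = (best.code, best.score)
    return top
```

For the cell above this would be called as `top_language_by_key(response.data)`.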


## Translation

see: https://docs.oracle.com/en-us/iaas/language/using/translate-text.htm

In [None]:
# Translate a few sentences from English to Dutch.  Feel free to change the text or the languages


key1 = "doc1"
key2 = "doc2"
text1 = "The Indy Autonomous Challenge is the world's first head-to-head, high speed autonomous race taking place at the Indianapolis Motor Speedway"
text2 = "OCI will be the cloud engine for the artificial intelligence models that drive the MIT Driverless cars."
target_language = "nl"  # TODO: specify the target language

doc1 = oci.ai_language.models.TextDocument(key=key1, text=text1, language_code="en")
doc2 = oci.ai_language.models.TextDocument(key=key2, text=text2, language_code="en")
documents = [doc1, doc2]


batch_language_translation_details = oci.ai_language.models.BatchLanguageTranslationDetails(
    documents=documents,
    compartment_id=compartmentId,
    target_language_code=target_language)
output = lang_client.batch_language_translation(batch_language_translation_details)
print(output.data)
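
To line the translations up with the original documents, the result can be indexed by document key. This is a minimal sketch that assumes each document result exposes `key` and `translated_text`; verify those attribute names against the SDK model. The helper name `translations_by_key` is hypothetical.

```python
def translations_by_key(result):
    """Return {document key: translated text} for a translation result.

    The `key` and `translated_text` attribute names are assumptions
    about the response shape; only duck typing is used here.
    """
    return {doc.key: doc.translated_text for doc in result.documents}
```

With the cell above, `translations_by_key(output.data)[key1]` would then hold the translation of `text1`.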