### OCI Data Science - Useful Tips
<details>
<summary><font size="2">Check for Public Internet Access</font></summary>

```python
import requests

# A timeout keeps the cell from hanging when outbound access is blocked.
response = requests.get("https://oracle.com", timeout=10)
assert response.status_code == 200, "Internet connection failed"
```
</details>
<details>
<summary><font size="2">Helpful Documentation</font></summary>
<ul><li><a href="https://docs.cloud.oracle.com/en-us/iaas/data-science/using/data-science.htm">Data Science Service Documentation</a></li>
<li><a href="https://docs.cloud.oracle.com/iaas/tools/ads-sdk/latest/index.html">ADS documentation</a></li>
</ul>
</details>
<details>
<summary><font size="2">Typical Cell Imports and Settings for ADS</font></summary>

```python
%load_ext autoreload
%autoreload 2
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

import logging
logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.ERROR)

import ads
from ads.dataset.factory import DatasetFactory
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
from ads.evaluations.evaluator import ADSEvaluator
from ads.common.data import ADSData
from ads.explanations.explainer import ADSExplainer
from ads.explanations.mlx_global_explainer import MLXGlobalExplainer
from ads.explanations.mlx_local_explainer import MLXLocalExplainer
from ads.catalog.model import ModelCatalog
from ads.common.model_artifact import ModelArtifact
```
</details>
<details>
<summary><font size="2">Useful Environment Variables</font></summary>

```python
import os
print(os.environ["NB_SESSION_COMPARTMENT_OCID"])
print(os.environ["PROJECT_OCID"])
print(os.environ["USER_OCID"])
print(os.environ["TENANCY_OCID"])
print(os.environ["NB_REGION"])
```
</details>
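
Outside a notebook session some of these variables may be unset, in which case `os.environ[...]` raises `KeyError`. A minimal sketch that reports missing values instead of failing; `describe_session` is a hypothetical helper, not part of ADS:

```python
import os

# Variable names taken from the "Useful Environment Variables" cell above.
SESSION_VARS = [
    "NB_SESSION_COMPARTMENT_OCID",
    "PROJECT_OCID",
    "USER_OCID",
    "TENANCY_OCID",
    "NB_REGION",
]


def describe_session(env=None):
    """Return {name: value or None} for each session variable (hypothetical helper)."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in SESSION_VARS}


for name, value in describe_session().items():
    print(f"{name}: {value if value is not None else '<not set>'}")
```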

In [None]:
## LangChain deployment

In [9]:
import tempfile
import ads
from ads.model.generic_model import GenericModel
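from config import (
    CONDA_PACK_PATH, LOG_GROUP_ID,
    LANGCHAIN_MODEL_ACCESS_LOG_LOG_ID, LANGCHAIN_MODEL_PREDICT_LOG_LOG_ID,
)

# Authenticate with the notebook session's resource principal.
ads.set_auth("resource_principal")

# No estimator is passed and nothing is serialized; inference is handled by score.py.
langchain_model = GenericModel(artifact_dir="langchain_nl2sql_model", estimator=None, serialize=False)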
langchain_model.summary_status()



Step,Status,Details,Actions Needed
initiate,Done,Initiated the model,
prepare(),Available,Generated runtime.yaml,
prepare(),Available,Generated score.py,
prepare(),Available,Serialized model,
prepare(),Available,"Populated metadata(Custom, Taxonomy and Provenance)",
verify(),Not Available,Local tested .predict from score.py,
save(),Not Available,Conducted Introspect Test,
save(),Not Available,Uploaded artifact to model catalog,
deploy(),UNKNOWN,Deployed the model,
predict(),Not Available,Called deployment predict endpoint,


In [10]:
# Copy the required Python scripts into the artifact directory first, e.g. from a shell:
#   mkdir langchain_nl2sql_model
#   cp config.py config_private.py oci_utils.py oracle_vector_db.py langchain_nl2sql_model/
langchain_model.prepare(
    inference_conda_env=CONDA_PACK_PATH,
    inference_python_version="3.9",
    model_file_name="test",
    score_py_uri="langchain_nl2sql_model_score.py",
    force_overwrite=True,
)

INFO:ADS:To auto-extract taxonomy metadata the model must be provided. Supported models: keras, lightgbm, pytorch, sklearn, tensorflow, pyspark, and xgboost.


algorithm: null
artifact_dir:
  /home/datascience/langchain_nl2sql_model:
  - - oracle_vector_db.py
    - config.py
    - config_private.py
    - runtime.yaml
    - test_json_output.json
    - score.py
    - .model-ignore
    - oci_utils.py
framework: null
model_deployment_id: null
model_id: null
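
The commented shell commands in the cell above place the helper modules in the artifact directory, presumably so that `score.py` can import them at inference time. A Python equivalent, as a sketch: `copy_support_files` is a hypothetical helper, and the file names are the ones listed in those comments.

```python
import shutil
from pathlib import Path

# File names come from the commented shell commands in the cell above.
SUPPORT_FILES = ["config.py", "config_private.py", "oci_utils.py", "oracle_vector_db.py"]


def copy_support_files(artifact_dir, source_dir=".", files=SUPPORT_FILES):
    """Copy helper modules into the artifact directory; return the names that were missing."""
    target = Path(artifact_dir)
    target.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in files:
        source = Path(source_dir) / name
        if source.is_file():
            shutil.copy2(source, target / name)
        else:
            missing.append(name)
    return missing
```

Calling `copy_support_files("langchain_nl2sql_model")` before `prepare()` and checking that the returned list is empty catches a forgotten module before the artifact is uploaded.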

In [11]:
op = langchain_model.verify('Are there partners that was nominated after the permitted nomination period?')


<class 'str'>
###Query Are there partners that was nominated after the permitted nomination period?
Inside
Inside1
Inside2
INFO:ConsoleLogger:SQL Query: select V.id, C.CHUNK, C.PAGE_NUM,
                            VECTOR_DISTANCE(V.VEC, :1, COSINE) as d,
                            B.NAME 
                            from VECTORS V, CHUNKS C, DOCUMENTS B
                            where C.ID = V.ID and
                            C.DOCUMENTS_ID = B.ID
                            order by d
                            FETCH  FIRST 2 ROWS ONLY
INFO:ConsoleLogger:Query duration: 0.0 sec.
###Reranker Result Max value: -5.125233173370361
INFO:langchain_community.llms.oci_data_science_model_deployment_endpoint:LLM API Request:
 Given an input Question, create a syntactically correct Oracle SQL query to run. 
Pay attention to using only the column names that you can see in the schema description.
Be careful to not query for columns that do not exist. Also, pay attention to which column is i
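
The logged SQL ranks chunks by `VECTOR_DISTANCE(V.VEC, :1, COSINE)` and keeps the two closest rows. For intuition, here is a pure-Python sketch of the same ranking over in-memory vectors, assuming cosine distance is defined as 1 minus cosine similarity. The function names and sample vectors are illustrative and do not come from the notebook.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k_by_cosine(query, rows, k=2):
    """rows: iterable of (row_id, vector). Returns k (row_id, distance) pairs, closest first."""
    scored = [(row_id, cosine_distance(vec, query)) for row_id, vec in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy data: "a" points almost the same way as the query, "b" is nearly orthogonal.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(top_k_by_cosine([1.0, 0.1], rows))
```

In the deployed setup the database does this work; the sketch only shows why a smaller `d` means a more relevant chunk.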

In [12]:
print(op)

{'prediction': ' There are two partners, Partner1 and TestPartner, who were nominated after the permitted nomination period (2024-07-01 and 2024-07-10 respectively).'}
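
The notebook does not show `langchain_nl2sql_model_score.py`. OCI Data Science expects a score file to define `load_model()` and `predict(data, model)`, so a file producing the dictionary above could be shaped roughly as follows. This is a sketch under assumptions: `EchoChain` is a hypothetical stand-in for the real NL2SQL chain, and the input is assumed to be a plain question string, as the `<class 'str'>` line in the log suggests.

```python
from functools import lru_cache


class EchoChain:
    """Stand-in for the real LangChain NL2SQL chain (hypothetical)."""

    def invoke(self, question):
        return f"(answer to: {question})"


@lru_cache(maxsize=1)
def load_model():
    """Build the chain once and reuse it across requests."""
    return EchoChain()


def predict(data, model=None):
    """Answer a natural-language question and wrap the result in a 'prediction' key."""
    if not isinstance(data, str) or not data.strip():
        raise ValueError("expected a non-empty question string")
    chain = model if model is not None else load_model()
    return {"prediction": chain.invoke(data)}


print(predict("Which partners were nominated late?"))
```

Caching `load_model()` matters more for the real chain, where building database connections and LLM clients on every request would be slow.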


In [13]:
model_id = langchain_model.save(display_name="langchain-nl2sql-model")

['oracle_vector_db.py', 'config.py', 'config_private.py', 'runtime.yaml', 'test_json_output.json', 'score.py', '.model-ignore', 'oci_utils.py']



In [14]:
deploy = langchain_model.deploy(
    display_name="Langchain NL2SQL Model Deployment",
    deployment_log_group_id=LOG_GROUP_ID,
    deployment_access_log_id=LANGCHAIN_MODEL_ACCESS_LOG_LOG_ID,
    deployment_predict_log_id=LANGCHAIN_MODEL_PREDICT_LOG_LOG_ID,
    environment_variables={"CRYPTOGRAPHY_OPENSSL_NO_LEGACY":"1"},
    deployment_instance_shape="VM.Standard2.4",
    deployment_instance_subnet_id="ocid1.subnet.oc1.eu-frankfurt-1.<ocid>",
)


In [15]:
question = "Is there any partner with more than 10 customers and customer satisfaction higher than 4.5?"
op = deploy.predict(question)



In [16]:
print(op)

{'prediction': ' Partner1 and Partner2 both have more than 10 customers and customer satisfaction rating higher than 4.5.'}


In [None]:
question = "Are there partners that was nominated after the permitted nomination period?"
op = deploy.predict(question)
print(op)
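
When several questions need to be sent to the endpoint, a small wrapper keeps one failed call from stopping the rest. A minimal sketch: `ask_all` is a hypothetical helper, and it assumes each response is a dictionary with a `prediction` key, as in the outputs above.

```python
def ask_all(predict_fn, questions):
    """Call predict_fn for each question; return {question: answer text or error message}."""
    answers = {}
    for question in questions:
        try:
            response = predict_fn(question)
            # The outputs above start with a space, so trim surrounding whitespace.
            answers[question] = response["prediction"].strip()
        except Exception as exc:  # keep going if one call fails
            answers[question] = f"ERROR: {exc}"
    return answers
```

With the deployment from the earlier cell this would be called as `ask_all(deploy.predict, [question])`.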


