# Getting Started with the Data Science Service

The Data Science service uses [conda](https://anaconda.org/) environments to manage Python dependencies.

[![Notebook Examples](https://img.shields.io/badge/docs-notebook--examples-blue)](https://github.com/oracle-samples/oci-data-science-ai-samples/tree/master/notebook_examples)
[![Conda Environments](https://img.shields.io/badge/docs-conda--environments-blue)](https://docs.oracle.com/en-us/iaas/data-science/using/conda_understand_environments.htm)
[![Source Code](https://img.shields.io/badge/source-accelerated--datascience-blue)](https://github.com/oracle/accelerated-data-science)

## Upgrade Accelerated Data Science SDK - `oracle-ads`

The Oracle Accelerated Data Science (ADS) SDK is maintained by the Oracle Cloud Infrastructure (OCI) Data Science service team. It speeds up common data science tasks by automating or simplifying them, and it provides a data-scientist-friendly Pythonic interface to OCI services, most notably OCI Data Science, Data Flow, Object Storage, and the Autonomous Database. ADS gives you an interface to manage the lifecycle of machine learning models, from data acquisition through model evaluation, interpretation, and deployment.

Before you start working in a conda environment, upgrade the `oracle-ads` library: [![PyPI](https://img.shields.io/pypi/v/oracle-ads.svg)](https://pypi.org/project/oracle-ads/)  [![Python](https://img.shields.io/pypi/pyversions/oracle-ads.svg?style=plastic)](https://pypi.org/project/oracle-ads/)


You can check your version of `oracle-ads` by running:

In [None]:
import ads

print(ads.__version__)
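
If importing `ads` is slow or fails in a broken environment, the installed version can also be read from package metadata. The following is a minimal sketch using the standard library; `installed_version` is a hypothetical helper name, not part of `oracle-ads`.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when oracle-ads is not installed.
print(installed_version("oracle-ads"))
```

Because the lookup does not import the package, it also works when one of the package's own dependencies is missing.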

In [13]:
# Uncomment this code and set the correct proxy URLs if you have to set up a proxy for internet access
# import os
# os.environ['http_proxy']="http://myproxy"
# os.environ['https_proxy']="http://myproxy"

# Use os.environ['no_proxy'] to list hosts whose traffic should bypass the proxy
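
The proxy settings above can be wrapped in a small function so that the proxy and its exceptions are set together. This is only a sketch: `configure_proxy` is a hypothetical helper, `http://myproxy` is the placeholder from the cell above, and the `no_proxy` hosts are illustrative assumptions.

```python
import os


def configure_proxy(proxy_url, no_proxy_hosts=()):
    """Set proxy variables for this process; hosts in no_proxy_hosts bypass the proxy."""
    os.environ["http_proxy"] = proxy_url
    os.environ["https_proxy"] = proxy_url
    if no_proxy_hosts:
        # no_proxy takes a comma-separated list of hosts.
        os.environ["no_proxy"] = ",".join(no_proxy_hosts)


# Placeholder proxy URL and hypothetical bypass hosts; replace with your own.
configure_proxy("http://myproxy", no_proxy_hosts=("localhost", "127.0.0.1"))
print(os.environ["no_proxy"])
```

Variables set through `os.environ` are inherited by child processes started from the notebook, so a later `! pip install` cell should see the same proxy settings.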

In [None]:
# To upgrade, run:
! pip install oracle-ads --upgrade

## Authentication
To interact with OCI services, you need to authenticate with one of the following mechanisms:

### 1. Resource Principal

Resource Principal works by authorizing the notebook instance that you are using to read or manage OCI service resources such as Object Storage, Data Science Jobs, Data Science Models, and Data Science Model Deployments. The authorization is granted through policies:
    
- To set up policies for managing Data Science service resources, see [here](https://docs.oracle.com/en-us/iaas/data-science/using/policies.htm)
- To set up policies for managing Object Storage service resources, see [here](https://docs.oracle.com/en-us/iaas/Content/Identity/policiescommon/commonpolicies.htm#write-objects-to-buckets)

Other useful resources:

- https://docs.oracle.com/en-us/iaas/Content/Identity/Concepts/commonpolicies.htm
- https://docs.oracle.com/en-us/iaas/Content/Identity/Concepts/policygetstarted.htm#Getting_Started_with_Policies

Once the policies are set up, configure `oracle-ads` to use resource principal as follows:


```python
ads.set_auth('resource_principal')
```
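
A notebook that also has to run outside OCI, where no resource principal exists, may need to choose the mechanism at run time. The sketch below assumes that the `OCI_RESOURCE_PRINCIPAL_VERSION` environment variable is set wherever a resource principal is available; `pick_auth_mode` is a hypothetical helper, not an `oracle-ads` function.

```python
import os


def pick_auth_mode(environ=None):
    """Return 'resource_principal' if the environment advertises one, else 'api_key'."""
    env = os.environ if environ is None else environ
    # Assumption: OCI sets this variable where a resource principal is available.
    if env.get("OCI_RESOURCE_PRINCIPAL_VERSION"):
        return "resource_principal"
    return "api_key"


mode = pick_auth_mode()
print(mode)
# In a notebook, the result would then be passed on: ads.set_auth(mode)
```

Passing a plain dictionary as `environ` makes the choice easy to check without touching the real environment.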

### 2. API Key

To set up an API key, refer to:

- https://docs.oracle.com/en-us/iaas/Content/API/Concepts/apisigningkey.htm
- https://docs.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm


Once you have set up the config file and the keys, configure `oracle-ads` to use API keys:

```python
ads.set_auth('api_key')
```
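
Before switching to API key authentication, it can help to confirm that the config file profile is complete. The following sketch parses config text with the standard library and reports missing entries. It assumes the usual profile keys described in the SDK configuration documentation linked above (`user`, `fingerprint`, `tenancy`, `region`, `key_file`); `missing_config_keys` is a hypothetical helper and the sample values are placeholders.

```python
import configparser

# Keys that an API-key profile in an OCI config file normally contains.
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_config_keys(config_text, profile="DEFAULT"):
    """Return the required keys that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # configparser does not report DEFAULT as a regular section, so skip the check for it.
    if profile != "DEFAULT" and not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found")
    section = parser[profile]
    return [key for key in REQUIRED_KEYS if key not in section]


# Placeholder values only; real OCIDs and paths come from your own tenancy.
sample = """
[DEFAULT]
user = ocid1.user.oc1..example
fingerprint = 00:11:22:33
tenancy = ocid1.tenancy.oc1..example
region = us-ashburn-1
"""

print(missing_config_keys(sample))  # ['key_file']
```

An empty list means every expected key is present in the profile. It does not verify that the key file exists or that the credentials are valid.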

## Working with Data on Object Storage

In [14]:
import ads
import pandas as pd

ads.set_auth("resource_principal")

In [15]:
bucket_name = "hosted-ds-datasets"
namespace = "bigdatadatasciencelarge"


file_name = "titanic/titanic.csv"
df = pd.read_csv(
    f"oci://{bucket_name}@{namespace}/{file_name}",
    storage_options=ads.common.auth.default_signer(),
)
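
The path passed to `pd.read_csv` above follows the pattern `oci://<bucket>@<namespace>/<object name>`. As a sketch of that pattern, the hypothetical helpers below build and split such URIs with plain string handling; they are illustrative and not part of `oracle-ads` or pandas.

```python
def oci_uri(bucket, namespace, object_name):
    """Build an oci://bucket@namespace/object URI."""
    return f"oci://{bucket}@{namespace}/{object_name.lstrip('/')}"


def parse_oci_uri(uri):
    """Split an oci:// URI into (bucket, namespace, object_name)."""
    prefix = "oci://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an oci:// URI: {uri!r}")
    location, _, object_name = uri[len(prefix):].partition("/")
    bucket, separator, namespace = location.partition("@")
    if not separator or not bucket or not namespace:
        raise ValueError(f"expected bucket@namespace in {uri!r}")
    return bucket, namespace, object_name


uri = oci_uri("hosted-ds-datasets", "bigdatadatasciencelarge", "titanic/titanic.csv")
print(uri)
print(parse_oci_uri(uri))
```

Keeping the bucket, namespace, and object name as separate values makes it easier to reuse the same bucket for several files.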

In [16]:
df.head()

|   | Survived | Pclass | Name | Sex | Age | Siblings/Spouses Aboard | Parents/Children Aboard | Fare |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | Mr. Owen Harris Braund | male | 22.0 | 1 | 0 | 7.25 |
| 1 | 1 | 1 | Mrs. John Bradley (Florence Briggs Thayer) Cum... | female | 38.0 | 1 | 0 | 71.2833 |
| 2 | 1 | 3 | Miss. Laina Heikkinen | female | 26.0 | 0 | 0 | 7.925 |
| 3 | 1 | 1 | Mrs. Jacques Heath (Lily May Peel) Futrelle | female | 35.0 | 1 | 0 | 53.1 |
| 4 | 0 | 3 | Mr. William Henry Allen | male | 35.0 | 0 | 0 | 8.05 |


### Working with other sources

To load data from other sources, see the guide [here](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/loading_data/connect.html).

## References

* [Oracle Accelerated Data Science SDK Guide](https://accelerated-data-science.readthedocs.io/en/latest/)
* [Oracle Accelerated Data Science Source Code](https://github.com/oracle/accelerated-data-science)
* [Notebook Examples](https://github.com/oracle-samples/oci-data-science-ai-samples/tree/master/notebook_examples)
* [Conda environments](https://docs.oracle.com/en-us/iaas/data-science/using/conda_understand_environments.htm)
* [Publish Conda Environments](https://docs.oracle.com/en-us/iaas/data-science/using/conda_publishs_object.htm)