### OCI Data Science - Useful Tips
<details>
<summary><font size="2">Check for Public Internet Access</font></summary>

```python
import requests
# A timeout keeps the cell from hanging indefinitely when there is no route out
response = requests.get("https://oracle.com", timeout=10)
assert response.status_code == 200, "Internet connection failed"
```
</details>
<details>
<summary><font size="2">Helpful Documentation </font></summary>
<ul><li><a href="https://docs.cloud.oracle.com/en-us/iaas/data-science/using/data-science.htm">Data Science Service Documentation</a></li>
<li><a href="https://docs.cloud.oracle.com/iaas/tools/ads-sdk/latest/index.html">ADS documentation</a></li>
</ul>
</details>
<details>
<summary><font size="2">Typical Cell Imports and Settings for ADS</font></summary>

```python
%load_ext autoreload
%autoreload 2
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

import logging
logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.ERROR)

import ads
from ads.dataset.factory import DatasetFactory
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
from ads.evaluations.evaluator import ADSEvaluator
from ads.common.data import ADSData
from ads.explanations.explainer import ADSExplainer
from ads.explanations.mlx_global_explainer import MLXGlobalExplainer
from ads.explanations.mlx_local_explainer import MLXLocalExplainer
from ads.catalog.model import ModelCatalog
from ads.common.model_artifact import ModelArtifact
```
</details>
<details>
<summary><font size="2">Useful Environment Variables</font></summary>

```python
import os
print(os.environ["NB_SESSION_COMPARTMENT_OCID"])
print(os.environ["PROJECT_OCID"])
print(os.environ["USER_OCID"])
print(os.environ["TENANCY_OCID"])
print(os.environ["NB_REGION"])
```
</details>
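
Indexing `os.environ` directly raises `KeyError` when a variable is not set, for example when the code runs outside a notebook session. A minimal sketch of a more forgiving lookup follows; the helper name `read_env` and the `OCI_VARS` list are illustrative assumptions, not part of any OCI SDK.

```python
import os

# Variable names taken from the tip above
OCI_VARS = [
    "NB_SESSION_COMPARTMENT_OCID",
    "PROJECT_OCID",
    "USER_OCID",
    "TENANCY_OCID",
    "NB_REGION",
]


def read_env(names, environ=None):
    """Return a dict mapping each name to its value, or None if it is unset.

    `environ` defaults to os.environ; passing a plain dict is only meant
    to make the helper easy to try out (hypothetical convenience).
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in names}


# Print every variable, flagging the ones that are missing
for name, value in read_env(OCI_VARS).items():
    print(name, "=", value if value is not None else "<not set>")
```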

In [None]:
from pyrdf2vec.graphs import KG
from pyrdf2vec.walkers import RandomWalker
import pandas as pd
import time

In [None]:
# Initialize the knowledge graph (KG) object with the RDF data.
# skip_predicates is a set of predicate URIs to ignore; an empty set skips none.
kg = KG(
    "recipekg_1-4.ttl",
    skip_predicates=set(),
)

In [None]:
df = pd.read_csv("recipe_uris_1-4.csv")
# Extract the recipe URIs from the 'recipes' column and convert them into a list
recipes = df['recipes'].tolist()

In [None]:
# Initialize lists to store the number of paths and the runtime for each iteration
paths_nums = []
run_times = []

# Benchmark growing prefixes of the recipe list: start with the first 10
# entities, then grow the prefix by 1000 entities per iteration
for i in range(10, len(recipes), 1000):
    
    # Initialize the RandomWalker with a maximum depth of 4
    walker = RandomWalker(max_depth=4)
    
    # Measure the time taken to extract walks for the first i entities.
    # perf_counter() is a monotonic, high-resolution clock suited to timing.
    start_time = time.perf_counter()
    walks = walker.extract(kg, recipes[:i])
    end_time = time.perf_counter()
    
    # Calculate the runtime and round it to 3 decimal places
    runtime = round(end_time - start_time, 3)
    run_times.append(runtime)  # Store the runtime
    
    # Calculate the total number of paths generated and store it
    paths_nums.append(sum(len(sublist) for sublist in walks))
    
    # Print the current iteration and its corresponding runtime
    print(i, ':', runtime)
    # Print the list of runtimes accumulated so far
    print(run_times)
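
The timing loop above can be factored into a small reusable helper, which makes it easier to rerun the benchmark with other settings. This is a sketch under stated assumptions: the function name `time_growing_prefixes` and its parameters are hypothetical, and `extract` stands for any callable that takes a list of entities and returns one list of walks per entity.

```python
import time
from typing import Callable, List, Sequence, Tuple


def time_growing_prefixes(
    extract: Callable[[Sequence[str]], List[list]],
    entities: Sequence[str],
    start: int,
    step: int,
) -> List[Tuple[int, int, float]]:
    """Time `extract` on entities[:n] for n = start, start + step, ...

    Returns one (n, total_walks, seconds) tuple per prefix size.
    """
    results = []
    # Upper bound is len(entities) + 1 so a prefix covering every entity
    # is included when it falls on the step grid
    for n in range(start, len(entities) + 1, step):
        t0 = time.perf_counter()
        walks = extract(entities[:n])
        elapsed = time.perf_counter() - t0
        # Total number of walks across all entities in this prefix
        total_walks = sum(len(per_entity) for per_entity in walks)
        results.append((n, total_walks, elapsed))
    return results
```

With the objects defined in the cells above, a call might look like `time_growing_prefixes(lambda ents: RandomWalker(max_depth=4).extract(kg, ents), recipes, 10, 1000)`. Returning tuples instead of printing keeps the measurements together for later plotting or tabulation.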