In [1]:
#conda install python-dotenv
#pip install "openai<1" to install the older version of the OpenAI API we will upgrade later
import openai 
from dotenv import dotenv_values

###### **Secure your API Key**: To safeguard your OpenAI API key, it's recommended to keep it in an environment variable within your .env file. This way, the key remains hidden and not plainly visible within your Jupyter notebook.

In [9]:
# getting  OPENAI_API_KEY  from the .env file 
config = dotenv_values(".env")
openai.api_key = config["OPENAI_API_KEY"]


###### **Helper Function** To call the function get_completion() to reuse the call to the openaai api endpoint
example :
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

In [10]:

''' What is temperature=0 ?
Higher values like 0.8 will make the output more random, 
while lower values like 0.2 will make it more focused and deterministic.
Deterministic Output: With temperature set to 0, the model's output 
becomes completely deterministic. This means that for a given input or prompt, the model 
will always produce the same output, no matter how many times you generate a response. 
'''
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
        
    )
    return response.choices[0].message["content"]


'''In order to use the OpenAI library version 1.0.0, here is the code that you would use instead for the get_completion function:

client = openai.OpenAI()

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content
''

#### Prompting Principles
###### **Principle 1** - Articulate Precise and Detailed Directives: Ensure that your instructions are articulated with clarity and precision, leaving no room for ambiguity.
###### **Principle 2** - Allocate Adequate Processing Time: Allow sufficient time for the model to process and 'ponder' over the given instructions for optimal outcomes.

###### Applying Principle 1   Delimiters can be anything like: ```, """, < >, <tag> </tag>, :
###### Applying Principle 1   Ask for strucutred output LIST, JSON HTML:
###### Applying Principle 1   Check the answers with the condition to make sure it is not making wild assumptions :
###### Applying Principle 1   Few-shot promting, providing the model with few example so it can understand the task :


In [11]:
# Using clear Delimiters in the text
text = f"""
Snowflake’s architecture is a hybrid of traditional \
shared-disk and shared-nothing database architectures. 
Similar to shared-disk architectures, Snowflake uses a \
central data repository for persisted data that is accessible from all compute nodes in the platform.\
But similar to shared-nothing architectures, Snowflake processes queries using MPP (massively parallel processing) \
compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers \
the data management simplicity \
of a shared-disk architecture, but with the performance and scale-out benefits of a shared-nothing architecture.
"""
prompt = f"""
Summarize the text delimited by triple backticks \ 
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)

Snowflake's architecture combines elements of both shared-disk and shared-nothing database architectures, utilizing a central data repository for persisted data that is accessible from all compute nodes, while also processing queries using MPP compute clusters where each node stores a portion of the data set locally, providing the simplicity of a shared-disk architecture with the performance and scalability advantages of a shared-nothing architecture.


In [12]:
# Displaying in strcutured output
prompt = f"""
Generate a list of three psychology books  from 1800s along \ 
with their authors and  first year of pulication  . 
Generate t them in JSON format with the following keys: 
ISBN, title, author, year.
"""
response = get_completion(prompt)
print(response)

[
  {
    "ISBN": "978-0486203815",
    "title": "The Interpretation of Dreams",
    "author": "Sigmund Freud",
    "year": 1899
  },
  {
    "ISBN": "978-0486417818",
    "title": "The Principles of Psychology",
    "author": "William James",
    "year": 1890
  },
  {
    "ISBN": "978-0486206212",
    "title": "Psychopathology of Everyday Life",
    "author": "Sigmund Freud",
    "year": 1901
  }
]


In [13]:
# Checking for conditions 

text=f"""Loading data into Snowflake involves several key steps.\
First, prepare your data in a compatible format, such as CSV, JSON, \
Parquet, or Avro, and ensure it's clean and structured. Store your data in a \
cloud storage solution like Amazon S3, Google Cloud Storage, or Azure Blob Storage, \
and set up an external stage in Snowflake to link to this location. Define a file format in \
Snowflake to interpret the data files correctly. Create a database and schema within Snowflake, \
followed by a table with a schema matching your data. Use the COPY INTO command to transfer data from the stage \
to your Snowflake table, specifying the file format and handling options. Finally, validate the loaded data by executing \
queries and checking for any discrepancies to ensure a successful data transfer into Snowflake's cloud-based data warehousing \
environment."""
prompt =f""" Based on the instruction provided in the text to load data into snowflake, provide the step by step instruction in the following format:
STEP 1 -
STEP 2 -
STEP N -

Make each step heading in bold
if the instruction provided in the text doesn't contain sequence of instructions then write "No instruction provided"
```{text}```
"""

response = get_completion(prompt)
print(response)

**STEP 1 - Prepare your data in a compatible format**
- Ensure your data is clean and structured
- Convert your data into a compatible format such as CSV, JSON, Parquet, or Avro

**STEP 2 - Store your data in a cloud storage solution**
- Choose a cloud storage solution like Amazon S3, Google Cloud Storage, or Azure Blob Storage
- Upload your data files to the chosen cloud storage location

**STEP 3 - Set up an external stage in Snowflake**
- Create an external stage in Snowflake to link to the cloud storage location where your data is stored
- Configure the stage to specify the cloud storage provider, location, and credentials

**STEP 4 - Define a file format in Snowflake**
- Define a file format in Snowflake that matches the format of your data files
- Specify the file format options such as field delimiter, record delimiter, and data type mappings

**STEP 5 - Create a database and schema in Snowflake**
- Create a database in Snowflake to hold your data
- Create a schema within the da

In [14]:
#Few-Shot Promting 
prompt=f"""Your task is to answer abour snowflake cloud platform.
<User> :  What is Snowflake Computing?
<Expert>:Snowflake, available on AWS, Azure, and Google Cloud,\
offers Data Warehouse-as-a-Service, handling vast structured or semi-structured data \
volumes with minimal effort. Its unique selling point isn't just being a SaaS data warehouse; \
it’s the specific features and efficiencies that set it apart from competitors
<User> : what is multi-cloud platform?
"""
response = get_completion(prompt)
print(response)

<Expert>: A multi-cloud platform refers to the use of multiple cloud computing services from different providers, such as AWS, Azure, and Google Cloud, to meet an organization's needs. It allows businesses to leverage the strengths and capabilities of different cloud providers, avoiding vendor lock-in and increasing flexibility and resilience. With a multi-cloud approach, organizations can distribute workloads across different clouds, optimize costs, and take advantage of specific services offered by each provider.
