<a href="https://colab.research.google.com/github/sanja7s/LLM_uses_AI_technology/blob/main/Reality_check_Prompt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup
#### Load the API key and relevant Python libraries.

In [None]:
!pip install langchain
!pip install faiss-cpu
# The code below calls openai.ChatCompletion, which was removed in openai 1.0
!pip install "openai<1.0"
!pip install unstructured
!pip install python-dotenv
!pip install tiktoken



In [None]:
from google.colab import files
import io
from dotenv import dotenv_values, load_dotenv, find_dotenv
import openai
import os
import json
import re
import ast
from itertools import islice
import time
from copy import deepcopy

uploaded = files.upload()

Saving env to env (1)


In [None]:
# Get the first key from the uploaded dictionary
env_file_key = list(uploaded.keys())[0]

# Read the uploaded file
env_content = uploaded[env_file_key].decode('utf-8')

# Load the content into a variable
env_variables = dotenv_values(stream=io.StringIO(env_content))

api_key = env_variables['OPENAI_API_KEY']
openai.api_key = api_key

## Models

In [None]:
def get_completion_from_messages(messages,
                                 model="gpt-4",
                                 temperature=0,
                                 max_tokens=6000):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # the degree of randomness of the model's output
        max_tokens=max_tokens, # the maximum number of tokens the model can output
    )
    return response.choices[0].message["content"]

In [None]:
def get_completion_and_token_count(messages,
                                 model="gpt-4",
                                 temperature=0,
                                 max_tokens=4500):

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )

    content = response.choices[0].message["content"]

    token_dict = {
        'prompt_tokens': response['usage']['prompt_tokens'],
        'completion_tokens': response['usage']['completion_tokens'],
        'total_tokens': response['usage']['total_tokens'],
    }

    return content, token_dict
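
The token counts returned above can be turned into a rough cost estimate. The following is a minimal sketch, assuming the same per-1K-token rates that this notebook hard-codes in its cost calculation (0.03 for prompt tokens, 0.06 for completion tokens); the helper name `estimate_cost` and the example usage dictionary are hypothetical and not part of the original notebook.

```python
# Assumed per-1K-token rates in USD, mirroring the constants used in this
# notebook's cost calculation; they are not fetched from any price list.
PROMPT_RATE_PER_1K = 0.03
COMPLETION_RATE_PER_1K = 0.06


def estimate_cost(token_dict):
    """Return the estimated USD cost for one call's token usage."""
    prompt_cost = token_dict["prompt_tokens"] * PROMPT_RATE_PER_1K
    completion_cost = token_dict["completion_tokens"] * COMPLETION_RATE_PER_1K
    return (prompt_cost + completion_cost) / 1000.0


# Hypothetical usage numbers, shaped like the token_dict returned above
example_usage = {"prompt_tokens": 1000, "completion_tokens": 500, "total_tokens": 1500}
print(f"{estimate_cost(example_usage):.2f}")  # 0.06
```

Keeping the rates in named constants makes it easier to update the estimate if a different model is used.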

## Reality-Check Prompt

In [None]:
system_message = """As an AI Technology Specialist and Evaluator, your dedication to precision and accuracy drives a meticulous process of categorizing diverse applications of AI technology.
Your methodology centers on conducting exhaustive research and analysis, yielding comprehensive insights that play a pivotal role in enhancing the comprehension and systematic categorization of AI technology."""

user_message = """Categorize the uses of facial recognition technology listed in List A into three specific categories: Already existent uses, Upcoming uses, and Unlikely uses. Provide a one-sentence justification for each categorization.
Keep in mind that any applications should be assessed considering their implementation through facial recognition technology.
Take account of the following definitions of the three categories when categorizing the uses of AI technology in List A:

1. Already existent uses of facial recognition technology: encompass uses that are currently implemented and well-established uses.
2. Upcoming uses of facial recognition technology: encompass uses that are currently under development, being researched, or subject to discussions. So far, these uses have either not been implemented or have been severely limited in practice due to various reasons.
3. Unlikely uses of facial recognition technology: encompass uses that lack value, usability, applicability, or practicality, or are deemed unnecessary, impossible, incoherent, or unrealistic.

Follow the provided example structure when reporting the categorizations and judgments of the uses listed in List A:
Example structure:
{
    "Categorizations and Judgements": [
        "1: Already existent. Virtual assistants like Siri, Google Assistant, and Alexa are widely used to perform tasks, answer questions, set reminders, and control smart devices.",
        "2: Upcoming. Facial recognition technology has the potential to revolutionize healthcare through medical condition diagnosis, yet its successful integration depends on resolving privacy, regulatory, and trust-related issues among medical practitioners and patients.",
        "3: Unlikely. Controlling or manipulating a person's dreams would be considered unrealistic and raises ethical concerns about invading the individual's subconscious without their consent.",
        "4: Already existent. AI-driven recommendation systems, as seen in platforms like Netflix, already utilize user data to suggest personalized content based on viewing preferences and habits.",
        "5: Unlikely. Facial recognition technology is designed for identifying human features and lacks the necessary parameters for effectively analyzing plant health and disease.",
        "6: Upcoming. Autonomous vehicles represent an upcoming AI use, driven by ongoing research for safer and more efficient self-driving capabilities through advanced algorithms and real-time data analysis."
    ],
    "Summary": {
        "Already existent": [2, [1, 4]],
        "Upcoming": [2, [2, 6]],
        "Unlikely": [2, [3, 5]],
    }
}
####
List A:
"""

## Functions
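
Before combining chunks, it can help to check that a response really follows the structure requested in the prompt. The sketch below is one possible check, not part of the original notebook: the helper `parse_reality_check` and both sample strings are hypothetical. It also illustrates a pitfall of `ast.literal_eval`: adjacent string literals without a separating comma are silently concatenated, so a list can come back with fewer entries than the summary claims.

```python
import ast

CATEGORIES = ("Already existent", "Upcoming", "Unlikely")


def parse_reality_check(response_text):
    """Parse a response and list any mismatches with the expected structure."""
    parsed = ast.literal_eval(response_text)
    judgements = parsed["Categorizations and Judgements"]
    summary = parsed["Summary"]
    problems = []
    listed = 0
    for category in CATEGORIES:
        count, ids = summary[category]
        if count != len(ids):
            problems.append(f"{category}: count {count} != {len(ids)} ids")
        listed += len(ids)
    if listed != len(judgements):
        problems.append(f"{listed} ids in summary but {len(judgements)} judgements")
    return parsed, problems


# Hypothetical well-formed response
sample_ok = '''{
    "Categorizations and Judgements": [
        "1: Already existent. Placeholder justification.",
        "2: Upcoming. Placeholder justification.",
        "3: Unlikely. Placeholder justification."
    ],
    "Summary": {
        "Already existent": [1, [1]],
        "Upcoming": [1, [2]],
        "Unlikely": [1, [3]]
    }
}'''

# Same response with the comma after the first judgement removed
sample_missing_comma = sample_ok.replace('justification.",', 'justification."', 1)

print(parse_reality_check(sample_ok)[1])             # []
print(parse_reality_check(sample_missing_comma)[1])  # ['3 ids in summary but 2 judgements']
```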

In [None]:
# Function 1

def combine_results(res1, res2):

  res = {}
  res["Summary"] = {"Already existent" : [],
                    "Upcoming" : [],
                    "Unlikely" : []
                    }
  res["Categorizations and Judgements"] = res1["Categorizations and Judgements"] \
   + res2["Categorizations and Judgements"]

  def extract_numbers(s):
    # Currently unused: fallback for summaries returned as strings such as "(30, 31, 33)"
    numbers = []
    try:
      # Find the contents of each pair of parentheses
      groups = re.findall(r'\((.*?)\)', s)
      # Split the first group on commas and convert each piece to an integer
      numbers = [int(num.strip()) for num in groups[0].split(',')]
    except (IndexError, ValueError, TypeError):
      print("COULD NOT PARSE SOME SUMMARY")
    return numbers

  type_of_uses = ["Already existent", "Upcoming", "Unlikely"]

  for type_of_use in type_of_uses:
    # The summaries already hold lists of ids, so no string parsing is needed
    numbers1 = res1["Summary"][type_of_use][1]
    numbers2 = res2["Summary"][type_of_use][1]
    numbers_joined = numbers1 + numbers2

    res["Summary"][type_of_use].append(len(numbers_joined))
    res["Summary"][type_of_use].append(numbers_joined)

  return res
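
To make the merge step concrete, here is a compact, self-contained sketch of the same idea applied to two toy chunk results. The function name `merge_results` and the toy data are hypothetical; the sketch assumes, as the code above does, that each summary entry has the form `[count, [ids]]`.

```python
CATEGORIES = ("Already existent", "Upcoming", "Unlikely")


def merge_results(res1, res2):
    """Concatenate the judgements and per-category id lists of two results."""
    merged = {
        "Categorizations and Judgements": (
            res1["Categorizations and Judgements"]
            + res2["Categorizations and Judgements"]
        ),
        "Summary": {},
    }
    for category in CATEGORIES:
        ids = res1["Summary"][category][1] + res2["Summary"][category][1]
        # Recompute the count from the joined id list
        merged["Summary"][category] = [len(ids), ids]
    return merged


# Toy results for two consecutive chunks (placeholder judgements)
chunk_a = {
    "Categorizations and Judgements": ["1: Already existent.", "2: Upcoming.", "3: Already existent."],
    "Summary": {"Already existent": [2, [1, 3]], "Upcoming": [1, [2]], "Unlikely": [0, []]},
}
chunk_b = {
    "Categorizations and Judgements": ["4: Already existent.", "5: Unlikely."],
    "Summary": {"Already existent": [1, [4]], "Upcoming": [0, []], "Unlikely": [1, [5]]},
}

print(merge_results(chunk_a, chunk_b)["Summary"])
# {'Already existent': [3, [1, 3, 4]], 'Upcoming': [1, [2]], 'Unlikely': [1, [5]]}
```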

In [None]:
# Function 2

def read_prompt_output():
  print("Select the right input you need.")
  selected_prompt_uploaded = files.upload() # change this for other prompts

  # Get the first key from the uploaded dictionary
  file_key = list(selected_prompt_uploaded.keys())[0]

  # Read the uploaded file
  file_content = selected_prompt_uploaded[file_key].decode('utf-8')

  file_content_dict = ast.literal_eval(file_content)

  N = len(file_content_dict[0])

  # Renumber the uses so that ids are unique across sublists
  # (assumes every sublist has the same length N)
  for i, el in enumerate(file_content_dict):
    for use_el in el:
      use_el['Use'] = int(use_el['Use']) + i * N

  flattened_list = [item for sublist in file_content_dict for item in sublist]
  return flattened_list
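
The renumbering and flattening step can be illustrated on a small nested list. This is a sketch under the same assumption as the function above, namely that every sublist has the same length and numbers its uses from 1; the name `renumber_and_flatten` and the `"Text"` key are hypothetical.

```python
def renumber_and_flatten(nested):
    """Give each use a globally unique id and return one flat list."""
    block_size = len(nested[0])
    flat = []
    for block_index, block in enumerate(nested):
        for item in block:
            renumbered = dict(item)  # copy so the input is left untouched
            renumbered["Use"] = int(item["Use"]) + block_index * block_size
            flat.append(renumbered)
    return flat


# Two toy sublists, each numbering its uses from 1
toy = [
    [{"Use": "1", "Text": "a"}, {"Use": "2", "Text": "b"}],
    [{"Use": "1", "Text": "c"}, {"Use": "2", "Text": "d"}],
]

print([item["Use"] for item in renumber_and_flatten(toy)])  # [1, 2, 3, 4]
```

Unlike the function above, this sketch copies each item instead of modifying the input in place.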

In [None]:
# Function 3

def parse_named_prompt_save_res(prompt_output, prompt_name, itteration, chunks = 6):
  RESPONSES = []
  # prompt_output = selected_prompt

  split_points = [len(prompt_output)*i//chunks for i in range(chunks)] + [len(prompt_output)]
  split_tuples = [(split_points[i],split_points[i+1]) for i in range(chunks)]

  cost = 0

  # split your data
  for (i,j) in split_tuples:
    print ("Reality-check for uses {} to {}".format(i,j))
    selected_prompt_subset = prompt_output[i:j]

    messages_reality_eval = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': str(user_message) + str(selected_prompt_subset)}
    ]

    response, token_count = get_completion_and_token_count(messages_reality_eval)

    # Estimated cost in USD, assuming $0.03 per 1K prompt tokens
    # and $0.06 per 1K completion tokens
    cost_chunk = (token_count['prompt_tokens'] * 0.03 + token_count['completion_tokens'] * 0.06)/1000.0
    cost += cost_chunk

    # Pause before sending the next request
    time.sleep(10)

    # Fail early if the response is not a valid Python literal
    ast.literal_eval(response)

    RESPONSES.append(response)


  print('REALITY-CHECK PROMPT QUERY COMPLETED.')
  tmp0 = ast.literal_eval(RESPONSES[0])

  for i in range(1,len(RESPONSES)):
    print(f"Combining {i} and {0}.")
    tmp1 = ast.literal_eval(RESPONSES[i])

    res0 = deepcopy(tmp0)
    res1 = deepcopy(tmp1)

    tmp0 = combine_results(res0,res1)

    print (tmp0['Summary'])


  fin_res = tmp0

  with open(prompt_name + f'_reality-check_v{itteration}.json', 'w') as json_file:
    json.dump(fin_res, json_file, indent=4)  # 4 spaces of indentation

  # Download the file to your local machine
  files.download(prompt_name + f'_reality-check_v{itteration}.json')

  return cost
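
The chunk boundaries used above come from integer division of the list length. As a small sketch, the same arithmetic can be isolated in a helper (the name `chunk_bounds` is hypothetical) to preview the ranges before any API call is made.

```python
def chunk_bounds(n_items, chunks):
    """Return (start, end) index pairs that split n_items into `chunks` slices."""
    points = [n_items * i // chunks for i in range(chunks)] + [n_items]
    return [(points[i], points[i + 1]) for i in range(chunks)]


print(chunk_bounds(138, 6))
# [(0, 23), (23, 46), (46, 69), (69, 92), (92, 115), (115, 138)]

print(chunk_bounds(10, 3))
# [(0, 3), (3, 6), (6, 10)]
```

When the length is not divisible by the number of chunks, the slices differ in size by at most one item, as the second call shows.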

## FRT LLM Uses

In [None]:
selected_prompt = read_prompt_output()

# Uncomment for cost calculation
# parse_named_prompt_save_res(prompt_output=selected_prompt, prompt_name='Var2_T0_2RU', itteration=1, chunks=6)

Select the right input you need.


Saving Final_Variation_2_3RU_T1 (6).json to Final_Variation_2_3RU_T1 (6).json


In [None]:
for i in range(5):
  parse_named_prompt_save_res(prompt_output=selected_prompt, prompt_name='Var2_T0_2RU', itteration=i, chunks=6)

Reality-check for uses 0 to 23
Reality-check for uses 23 to 46
Reality-check for uses 46 to 69
Reality-check for uses 69 to 92
Reality-check for uses 92 to 115
Reality-check for uses 115 to 138
REALITY-CHECK PROMPT QUERY COMPLETED.
Combining 1 and 0.
{'Already existent': [27, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46]], 'Upcoming': [12, [5, 6, 10, 11, 12, 13, 14, 26, 31, 33, 36, 40]], 'Unlikely': [6, [15, 21, 22, 38, 39, 41]]}
Combining 2 and 0.
{'Already existent': [46, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 56, 58, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69]], 'Upcoming': [16, [5, 6, 10, 11, 12, 13, 14, 26, 31, 33, 36, 40, 49, 55, 57, 60]], 'Unlikely': [6, [15, 21, 22, 38, 39, 41]]}
Combining 3 and 0.
{'Already existent': [64, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 5

Reality-check for uses 0 to 23
Reality-check for uses 23 to 46
Reality-check for uses 46 to 69
Reality-check for uses 69 to 92
Reality-check for uses 92 to 115
Reality-check for uses 115 to 138
REALITY-CHECK PROMPT QUERY COMPLETED.
Combining 1 and 0.
{'Already existent': [27, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46]], 'Upcoming': [14, [5, 6, 10, 11, 12, 13, 14, 15, 26, 31, 33, 36, 38, 40]], 'Unlikely': [4, [21, 22, 39, 41]]}
Combining 2 and 0.
{'Already existent': [46, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 56, 58, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69]], 'Upcoming': [18, [5, 6, 10, 11, 12, 13, 14, 15, 26, 31, 33, 36, 38, 40, 49, 55, 57, 60]], 'Unlikely': [4, [21, 22, 39, 41]]}
Combining 3 and 0.
{'Already existent': [64, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 5

Reality-check for uses 0 to 23
Reality-check for uses 23 to 46
Reality-check for uses 46 to 69
Reality-check for uses 69 to 92
Reality-check for uses 92 to 115
Reality-check for uses 115 to 138
REALITY-CHECK PROMPT QUERY COMPLETED.
Combining 1 and 0.
{'Already existent': [28, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46]], 'Upcoming': [11, [5, 6, 10, 11, 12, 13, 14, 15, 26, 33, 36]], 'Unlikely': [6, [21, 22, 38, 39, 40, 41]]}
Combining 2 and 0.
{'Already existent': [47, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 56, 58, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69]], 'Upcoming': [15, [5, 6, 10, 11, 12, 13, 14, 15, 26, 33, 36, 49, 55, 57, 60]], 'Unlikely': [6, [21, 22, 38, 39, 40, 41]]}
Combining 3 and 0.
{'Already existent': [63, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 5

Reality-check for uses 0 to 23
Reality-check for uses 23 to 46
Reality-check for uses 46 to 69
Reality-check for uses 69 to 92
Reality-check for uses 92 to 115
Reality-check for uses 115 to 138
REALITY-CHECK PROMPT QUERY COMPLETED.
Combining 1 and 0.
{'Already existent': [28, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46]], 'Upcoming': [12, [5, 6, 10, 11, 12, 13, 14, 26, 33, 36, 38, 40]], 'Unlikely': [5, [15, 21, 22, 39, 41]]}
Combining 2 and 0.
{'Already existent': [47, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 56, 58, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69]], 'Upcoming': [16, [5, 6, 10, 11, 12, 13, 14, 26, 33, 36, 38, 40, 49, 55, 57, 60]], 'Unlikely': [5, [15, 21, 22, 39, 41]]}
Combining 3 and 0.
{'Already existent': [63, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 5

Reality-check for uses 0 to 23
Reality-check for uses 23 to 46
Reality-check for uses 46 to 69
Reality-check for uses 69 to 92
Reality-check for uses 92 to 115
Reality-check for uses 115 to 138
REALITY-CHECK PROMPT QUERY COMPLETED.
Combining 1 and 0.
{'Already existent': [27, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46]], 'Upcoming': [15, [5, 6, 10, 11, 12, 13, 14, 15, 21, 22, 26, 31, 33, 36, 40]], 'Unlikely': [3, [38, 39, 41]]}
Combining 2 and 0.
{'Already existent': [46, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 51, 52, 53, 54, 56, 58, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69]], 'Upcoming': [19, [5, 6, 10, 11, 12, 13, 14, 15, 21, 22, 26, 31, 33, 36, 40, 49, 55, 57, 60]], 'Unlikely': [3, [38, 39, 41]]}
Combining 3 and 0.
{'Already existent': [63, [1, 2, 3, 4, 7, 8, 9, 16, 17, 18, 19, 20, 23, 24, 25, 27, 28, 29, 30, 32, 34, 35, 42, 43, 44, 45, 46, 47, 48, 50, 5
