## Together AI API
- [Python API docs](https://docs.together.ai/docs/inference-python)


In [1]:
import together

In [2]:
together.api_key = ''
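Rather than pasting the key into the notebook, it can be read from the environment. This is a minimal sketch: the variable name `TOGETHER_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not names taken from this notebook.

```python
import os

def load_api_key(env_var="TOGETHER_API_KEY"):
    """Return the API key stored in env_var, or raise if it is missing."""
    # env_var is a hypothetical name; use whatever your shell exports
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

# usage in this notebook (assumes the variable is set):
# together.api_key = load_api_key()
```

Keeping the key out of the cell also keeps it out of any exported copy of the notebook.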

## Select Model

In [3]:
# see available models
model_list = together.Models.list()

print(f"{len(model_list)} models available")

# print the first 10 models on the menu
model_names = [model_dict['name'] for model_dict in model_list]
model_names[:10]

117 models available


['Austism/chronos-hermes-13b',
 'DiscoResearch/DiscoLM-mixtral-8x7b-v2',
 'EleutherAI/llemma_7b',
 'Gryphe/MythoMax-L2-13b',
 'Meta-Llama/Llama-Guard-7b',
 'Nexusflow/NexusRaven-V2-13B',
 'NousResearch/Nous-Capybara-7B-V1p9',
 'NousResearch/Nous-Hermes-Llama2-13b',
 'NousResearch/Nous-Hermes-Llama2-70b',
 'NousResearch/Nous-Hermes-llama-2-7b']

## Pull up Mistral Models

In [4]:
[x for x in model_names if 'mistral' in x.lower()]

['Open-Orca/Mistral-7B-OpenOrca',
 'mistralai/Mistral-7B-Instruct-v0.1',
 'mistralai/Mistral-7B-v0.1',
 'mistralai/Mixtral-8x7B-Instruct-v0.1',
 'teknium/OpenHermes-2-Mistral-7B',
 'teknium/OpenHermes-2p5-Mistral-7B',
 'mistralai/Mixtral-8x7B-v0.1']
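The same filter can be wrapped in a small helper so other model families are easy to look up. This is a sketch only: `find_models` is a hypothetical name, and the sample list is a hand-picked subset of the names printed above.

```python
def find_models(names, keyword):
    """Return the model names containing keyword, ignoring case."""
    keyword = keyword.lower()
    return sorted(name for name in names if keyword in name.lower())

# a few names copied from the output above, for illustration
sample_names = [
    'mistralai/Mixtral-8x7B-Instruct-v0.1',
    'Gryphe/MythoMax-L2-13b',
    'teknium/OpenHermes-2p5-Mistral-7B',
]
print(find_models(sample_names, 'mistral'))
```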

In [5]:
model_to_use='mistralai/Mixtral-8x7B-Instruct-v0.1'

In [6]:
output = together.Complete.create(
  prompt = "<human>: What is the best christmas song of the 21st Century?\n<bot>:", 
  model = model_to_use, 
  max_tokens = 256,
  temperature = 0.8,
  top_k = 60,
  top_p = 0.6,
  repetition_penalty = 1.1,
  stop = ['<human>', '\n\n']
)

# print generated text
print(output['output']['choices'][0]['text'])

The best Christmas song of the 21st Century is subjective and depends on personal preference. However, a popular choice among many is "All I Want for Christmas Is You" by Mariah Carey, which was released in 1994 but gained significant popularity in the 21st century. Other notable songs include "Underneath the Arches" by Josh Groban and "Where Are You Christmas?" by Faith Hill.


## Try Image Models

In [7]:
import base64

# generate image 
response = together.Image.create(prompt="Jolly snowman")

# save the first image
image = response["output"]["choices"][0]
with open("snowman.png", "wb") as f:
    f.write(base64.b64decode(image["image_base64"]))

## Didn't really work

## Try for a RAG

## Function to get answers

In [12]:
model_to_use

'mistralai/Mixtral-8x7B-Instruct-v0.1'

In [51]:
def ask_llm(question,
            max_tokens=2048,
            model=model_to_use):
    'Ask the Together API a question using whichever model you prefer'

    # wrap the question in <human>/<bot> turns; this format appears to be necessary
    prompt = f"<human>: {question}\n<bot>:"
    response = together.Complete.create(
                  prompt = prompt,
                  model = model,
                  max_tokens = max_tokens,
                  temperature = 0.8,
                  top_k = 60,
                  top_p = 0.6,
                  repetition_penalty = 1.1,
                  stop = ['<human>', '\n\n']
                )

    # return only the generated text of the first choice
    return response['output']['choices'][0]['text']
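The prompt wrapping and the stop sequences can be tried without calling the API. This is a sketch under assumptions: `build_prompt` and `clean_completion` are hypothetical helpers that mirror the `<human>`/`<bot>` format and the `stop` list used above, and they are not part of the `together` package.

```python
def build_prompt(question, human_tag="<human>", bot_tag="<bot>"):
    """Wrap a question in the <human>/<bot> turn format used in this notebook."""
    return f"{human_tag}: {question.strip()}\n{bot_tag}:"

def clean_completion(text, stop=("<human>", "\n\n")):
    """Cut text at the earliest stop sequence and trim surrounding whitespace."""
    cut = len(text)
    for marker in stop:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()

print(build_prompt("what is narcissism?"))
print(clean_completion(" It is a trait.\n\n<human>: next question"))
```

`clean_completion` is only useful when a response comes back with text past a stop marker; when the API already honours `stop`, it is a no-op apart from the trimming.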

## Test Message Formatting

In [54]:
question = 'what is narcissism?'

In [None]:
answer = ask_llm(question,model='NousResearch/Nous-Hermes-Llama2-70b', max_tokens=1024)

In [None]:
answer

## Test Function over set of models

In [35]:
max_tokens = 2048
model='mistral-7b-instruct'

In [46]:
models = [
    'mistralai/Mixtral-8x7B-Instruct-v0.1',
    'teknium/OpenHermes-2p5-Mistral-7B',
    'NousResearch/Nous-Hermes-llama-2-7b',
    'togethercomputer/llama-2-70b',
]

## Compare Model Answers

In [49]:
question = 'who in your estimation are the most influential computer scientists of the 21st century?'

In [None]:
model_answers = {}

for model in models:
    answer = ask_llm(question=question, model=model)
    model_answers[model] = answer
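A single failing model would stop the loop above partway through. One possible variant records the error and moves on. This is a sketch: `collect_answers` is a hypothetical helper, and `ask` stands for any function with the same keyword arguments as `ask_llm`.

```python
def collect_answers(question, models, ask):
    """Call ask(question=..., model=...) for each model, recording errors instead of raising."""
    answers, errors = {}, {}
    for model in models:
        try:
            answers[model] = ask(question=question, model=model)
        except Exception as exc:  # keep going if one model fails
            errors[model] = repr(exc)
    return answers, errors

# usage in this notebook (makes one API call per model):
# model_answers, failed = collect_answers(question, models, ask_llm)
```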

In [None]:
model_answers.keys()

In [42]:
model_answers['']

'Determining the most influential computer scientists of the 21st century can be subjective and depends on the criteria used. However, here are a few who have made significant contributions:'

## UPDATE FROM HERE

In [None]:
from pathlib import Path
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer
from langchain.document_loaders import UnstructuredMarkdownLoader
from langchain.document_loaders import TextLoader

# Use as a RAG

## Embed Articles from The Last Psychiatrist

In [69]:
# use tiny model for embeddings
model_id = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_id)


In [72]:
# Load text data from a file using TextLoader
loader = TextLoader('data/who_can_know_how_much_randi_zu.txt')
documents = loader.load()

In [73]:
# format it for processing by embedding model
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
text_chunks = text_splitter.split_documents(documents)
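To see what `chunk_size` and `chunk_overlap` control, here is a simplified fixed-window chunker. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and lines); `chunk_text` is a hypothetical helper written only to illustrate the two parameters.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=50):
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when it is short enough.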

In [96]:
vectorstore_of_docs = FAISS.from_documents(text_chunks, embedding=embeddings)

## Function to prep docs for encoding

In [159]:
def load_text_files_for_encoding(file_names, directory):
    'Load each text file in directory and split the contents into overlapping chunks'
    documents = []
    for text_file in file_names:
        loader = TextLoader(str(Path(directory) / text_file))
        documents.extend(loader.load())

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
    chunked_documents = text_splitter.split_documents(documents)

    return chunked_documents

## Encoding a bunch of documents

In [160]:
directory = Path('/home/nick/Documents/repos/llm_playground/data')

file_names = [file.name for file in directory.glob('*.txt')]

chunked_documents = load_text_files_for_encoding(file_names, directory)

In [164]:
vectorstore_of_docs = FAISS.from_documents(chunked_documents, embedding=embeddings)

## Test the embeddings

In [165]:
# the number of results is set with k
related_text = vectorstore_of_docs.similarity_search('narcissism', k=3)

In [166]:
related_text

[Document(page_content="who like her-- the people she has to settle for-- are... not great.Genetics took care of her body but the upbringing affected her vision: the childhood of never good enough filters her present reality, obscures it, she can't see what is plain to everyone else, e.g. she's beautiful. So the process is to uncover the reasons why her view of reality is distorted and help her realign with reality. Use insight to strengthen her damaged ego, or, if you want a ten step approach, block automatic thoughts. In short, to understand that she is good, that men do find her attractive, not just the brazen ones, not just jerks.IV.If you think of narcissism as grandiosity you miss the nuances, e.g. in her case the problem is narcissism without any grandiosity: she is so consumed with her identity (as not pretty) that she is not able to read, to empathize with, other people's feelings. She doesn't care to try because it conflicts with how she sees herself. Ergo: Giorgio Armani was
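Conceptually, the search embeds the query and returns the chunks whose vectors lie closest to it. The sketch below ranks toy vectors by cosine similarity in plain Python. The metric is chosen for illustration only (the FAISS index built above may rank by a different distance), and `cosine` and `top_k_indices` are hypothetical helpers.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_indices(query_vec, doc_vecs, k=3):
    """Return the indices of the k document vectors most similar to query_vec."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# toy 2-d "embeddings"; real ones from all-MiniLM-L6-v2 have 384 dimensions
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(top_k_indices([1.0, 0.0], docs, k=2))
```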

## Ask Question with Context

In [167]:
def ask_llm2(question,
             max_tokens=2048,
             model=model_to_use,
             context=False,
             top_k=5,
             vectorstore=None):
    'Ask the Together API a question, with or without context, using whichever model you prefer'

    if context:
        # find relevant text and add it to the question
        related_text = vectorstore.similarity_search(question, k=top_k)
        # iterate over the results directly, since fewer than top_k may come back
        context_text = ' / '.join(doc.page_content for doc in related_text)
        files_used = ','.join(sorted({doc.metadata['source'] for doc in related_text}))
        context_string = f'{context_text} || from the files {files_used}'
        # update question string
        question = f'''Use the following pieces of context to answer the question at the end.
            Try to make the response brief and only answer the portion after Question:
            Do not refer to the text from this prompt outside of the context.
            Structure your answers as well-composed English sentences and do not include newline characters.

            {context_string}

            Question: {question}
            Answer: '''

    # wrap the question in <human>/<bot> turns; this format appears to be necessary
    prompt = f"<human>: {question}\n<bot>:"
    response = together.Complete.create(
                  prompt = prompt,
                  model = model,
                  max_tokens = max_tokens,
                  temperature = 0.8,
                  top_k = 60,
                  top_p = 0.6,
                  repetition_penalty = 1.1,
                  stop = ['<human>', '\n\n']
                )

    return response['output']['choices'][0]['text']
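The context-building step can be checked separately from the API call. This is a sketch under assumptions: `Doc` is a hypothetical stand-in for a LangChain `Document` (only `page_content` and `metadata` are modelled), `build_rag_question` is a hypothetical helper, and the instruction text is shortened from the prompt above.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def build_rag_question(question, docs):
    """Prepend the retrieved passages and their source files to the question."""
    context = ' / '.join(doc.page_content for doc in docs)
    sources = ','.join(sorted({doc.metadata.get('source', 'unknown') for doc in docs}))
    return (
        "Use the following pieces of context to answer the question at the end.\n"
        f"{context} || from the files {sources}\n"
        f"Question: {question}\n"
        "Answer: "
    )

retrieved = [Doc("first passage", {"source": "a.txt"}),
             Doc("second passage", {"source": "a.txt"})]
print(build_rag_question("Why is Randi Zuckerberg well known?", retrieved))
```

Printing the assembled prompt before sending it is a quick way to confirm that the retrieved passages really are about the question.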

In [168]:
test_query = 'Why is Randi Zuckerberg well known?'

In [169]:
ask_llm2(test_query)

'Randi Zuckerberg is well known because she is the sister of Mark Zuckerberg, the founder and CEO of Facebook. She is also an entrepreneur and a philanthropist in her own right, and has been involved in several successful ventures and initiatives. Randi has been featured in numerous media outlets and has been recognized for her work and achievements in the tech industry.'