# Uncensored Open Source LLM models

This notebook shows how to use a local LLM together with the langchain toolkit.

In [1]:
!pip install langchain



In [2]:
!pip install llama-cpp-python



## Select the model

Set the path to the model file. All models are located at `/mnt/HC_Volume_32195498/ai/`; they are *uncensored* variants that have had their *bias removed*.

In [3]:
my_model_path="/mnt/HC_Volume_32195498/ai/Wizard-Vicuna-13B-Uncensored-GGML/Wizard-Vicuna-13B-Uncensored.ggml.q5_0.bin"
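Before loading, it can help to confirm that the path points to an existing file, since a wrong path otherwise only shows up when the model is loaded. A minimal sketch using only the standard library; the helper name `check_model_path` is hypothetical and not part of langchain:

```python
from pathlib import Path


def check_model_path(path: str) -> Path:
    """Return the path as a Path object, or raise if it is not an existing file."""
    model_file = Path(path)
    if not model_file.is_file():
        raise FileNotFoundError(f"Model file not found: {model_file}")
    return model_file
```

Calling `check_model_path(my_model_path)` right after the cell above would fail early with a readable message if the volume is not mounted.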

In [4]:
!pip install openai



## OpenAI LLM

In [5]:
from langchain.llms import OpenAI
import os

openai_llm = OpenAI(openai_api_key=os.environ["OPENAI_API_KEY"])

Could not import azure.core python package.


## Select the seed

In [6]:
my_seed = 99945383

## Import libraries

In [7]:
from langchain.llms import LlamaCpp
from langchain import LLMChain
from langchain.prompts import PromptTemplate
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.memory import ConversationBufferWindowMemory

## Build the prompt

Here we frame the prompt and format it in such a way that the language model can more easily process it the way we want.

In [8]:
template = """Question: You're a friendly and to the point assistant ready to act as an encyclopedia of knowledge. My question is: {question}

Answer: Let's get straight to the point."""

prompt = PromptTemplate(template=template, input_variables=["question"])
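To see what the model actually receives, the template can be filled in by hand. The sketch below uses plain `str.format` as a stand-in for `prompt.format(question=...)`, on the assumption that the template uses the default f-string style placeholders. The name `example_template` is only used here for illustration; this is not langchain's implementation.

```python
# Same template text as in the cell above, under a separate name.
example_template = """Question: You're a friendly and to the point assistant ready to act as an encyclopedia of knowledge. My question is: {question}

Answer: Let's get straight to the point."""

# str.format replaces the {question} placeholder with the actual question.
filled = example_template.format(question="explain work with a moodboard with my team")
print(filled)
```

The text after `Answer:` is the start of the reply that the model is asked to continue, which nudges it towards a direct answer.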

## Loading the model
With `LlamaCpp` we can load the model. The available parameters are:

```
    lora_base: Default is None. This is the path to the Llama LoRA base model.
    lora_path: Default is None. This is the path to the Llama LoRA. If None, no LoRa is loaded.
    n_ctx: Default is 512. This is the token context window.
    n_parts: Default is -1. This is the number of parts to split the model into. If -1, the number of parts is automatically determined.
    seed: Default is -1. This is the seed for random number generation. If -1, a random seed is used.
    f16_kv: Default is True. This decides whether to use half-precision for key/value cache.
    logits_all: Default is False. This decides whether to return logits for all tokens, not just the last token.
    vocab_only: Default is False. This decides whether to only load the vocabulary, no weights.
    use_mlock: Default is False. This forces the system to keep the model in RAM.
    n_threads: Default is None. This is the number of threads to use. If None, the number of threads is automatically determined.
    n_batch: Default is 8. This is the number of tokens to process in parallel. Should be a number between 1 and n_ctx.
    n_gpu_layers: Default is None. This is the number of layers to be loaded into GPU memory.
    suffix: Default is None. This is a suffix to append to the generated text. If None, no suffix is appended.
    max_tokens: Default is 256. This is the maximum number of tokens to generate.
    temperature: Default is 0.8. This is the temperature to use for sampling.
    top_p: Default is 0.95. This is the top-p value to use for sampling.
    logprobs: Default is None. This is the number of logprobs to return. If None, no logprobs are returned.
    echo: Default is False. This decides whether to echo the prompt.
    stop: Default is an empty list []. This is a list of strings to stop generation when encountered.
    repeat_penalty: Default is 1.1. This is the penalty to apply to repeated tokens.
    top_k: Default is 40. This is the top-k value to use for sampling.
    last_n_tokens_size: Default is 64. This is the number of tokens to look back when applying the repeat_penalty.
    use_mmap: Default is True. This decides whether to memory-map the model file instead of reading it fully into RAM.
    streaming: Default is True. This decides whether to stream the results, token by token.
```
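The sampling parameters `temperature`, `top_k`, `top_p` and `seed` interact, so a toy version may help. The sketch below is a simplified illustration in plain Python, not the llama.cpp implementation; the function name `sample_token` and the dict-of-logits input are assumptions, and the real library may apply the steps in a different order.

```python
import math
import random


def sample_token(logits, temperature=0.8, top_k=40, top_p=0.95, rng=None):
    """Pick one token from a {token: logit} dict (toy illustration, temperature > 0)."""
    if rng is None:
        rng = random.Random()
    # Temperature rescales the logits: lower values sharpen the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = sorted(
        ((tok, weight / total) for tok, weight in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    # top_k keeps only the k most likely tokens.
    probs = probs[:top_k]
    # top_p keeps the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Sample among the survivors; random.choices normalizes the weights itself.
    tokens, token_weights = zip(*kept)
    return rng.choices(tokens, weights=token_weights, k=1)[0]
```

Passing a seeded generator, for example `rng=random.Random(99945383)`, makes the toy sampler reproducible, which is the role `seed` plays in the list above.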



In [9]:
# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
# Verbose is required to pass to the callback manager

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path=my_model_path, 
    callback_manager=callback_manager, 
    verbose=False,
    temperature=0.6,
    seed=my_seed,
    max_tokens=350,
    n_ctx=2048,
)

AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 


In [19]:
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "explain work with a moodboard with my team"

predict = llm_chain.predict(question=question)

Llama.generate: prefix-match hit


 A moodboard is a visual compilation of images, colors, textures, and other design elements that serve as inspiration for a project or brand identity. It can be created by individuals or teams to represent a particular style, theme, or feeling that they want to convey through their work. 

To create a moodboard with your team, you can start by brainstorming ideas and gathering images that reflect the overall vision for the project. You can use online tools like Pinterest or Canva to create digital moodboards, or you can print out physical collages using magazines, newspapers, and other materials.

Once you have a collection of inspiring images, you can start organizing them into different categories based on color scheme, texture, or theme. This will help you identify patterns and connections between the different elements, and ultimately guide your decision-making process when it comes to designing the final product.

A moodboard is a powerful tool for creative collaboration because i

# DuckDuckGo - Applying an agent

In [10]:
!pip install duckduckgo-search



In [11]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.agents import Tool, AgentExecutor, BaseSingleActionAgent
from langchain.tools import DuckDuckGoSearchRun

In [12]:
search = DuckDuckGoSearchRun()

tools = [
    Tool(
        name="Intermediate Answer",  # SELF_ASK_WITH_SEARCH expects exactly this tool name; other agent types may differ.
        func=search.run,
        description="useful for when you need to answer questions about current events",
        return_direct=True
    ),
]


In [13]:
from langchain.agents import initialize_agent, Tool

researcher = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

In [14]:
self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)
# self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")

In [15]:
from langchain.chains import SimpleSequentialChain

# Journalistic AutoGPT

In [16]:
# This is the editorial staff.

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history")

# The journalist
template = """You are a journalist from Vice. Given this search result, it is your job to write an intriguing and engaging expose about it.

Search result: {search_results}
Journalist: This is the expose:"""
prompt_template = PromptTemplate(input_variables=["search_results"], template=template)
journalist = LLMChain(
    llm=llm, 
    prompt=prompt_template, 
    verbose=True,
    memory=memory,
)

# The editor
template = """You are an editor from Vice. Given this article, it is your job to suggest improvements.

Research material: {proposal}
Editor: These are the suggestions:"""
prompt_template = PromptTemplate(input_variables=["proposal"], template=template)
editor = LLMChain(
    llm=llm, 
    prompt=prompt_template, 
    verbose=True,
    memory=memory,
)

# The journalist (second pass)
template = """You are a journalist from Vice. Given these suggestions from the editor, it is your job to decide which ones you want to implement.

Suggestion from editor: {suggestions}
Journalist: This is the new improved version according to suggestions from editor:"""
prompt_template = PromptTemplate(input_variables=["suggestions"], template=template)
second_journalist = LLMChain(
    llm=llm, 
    prompt=prompt_template, 
    verbose=True,
    memory=memory,
)

In [17]:
from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[self_ask_with_search, journalist, editor, second_journalist], verbose=True)
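`SimpleSequentialChain` hands the single output of each step to the next step as its single input. The idea can be sketched without any model at all; the four `fake_*` functions below are hypothetical stand-ins for the agent and the three chains, and `run_sequentially` is not a langchain function.

```python
from functools import reduce


def run_sequentially(steps, first_input):
    """Pass first_input through each step; every output becomes the next input."""
    return reduce(lambda text, step: step(text), steps, first_input)


# Hypothetical stand-ins for the four links of the chain.
def fake_search(topic):
    return f"results({topic})"


def fake_journalist(search_results):
    return f"draft({search_results})"


def fake_editor(proposal):
    return f"suggestions({proposal})"


def fake_second_journalist(suggestions):
    return f"revision({suggestions})"


pipeline = [fake_search, fake_journalist, fake_editor, fake_second_journalist]
result = run_sequentially(pipeline, "tea vs coffee")
print(result)  # revision(suggestions(draft(results(tea vs coffee))))
```

One consequence visible in the sketch: each step only sees the previous step's output, so the second journalist receives the editor's suggestions but not the original draft unless the editor repeats it.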

In [18]:
review = overall_chain.run("write an article about the positive effects of drinking tea vs coffee")



> Entering new SimpleSequentialChain chain...


> Entering new AgentExecutor chain...
 Yes, it could be helpful to know what specifically you want me to focus on in terms of positive effects.
Follow up: Could you provide some examples of specific positive effects of drinking tea versus coffee?
 Yes, it could be helpful to know what specifically you want me to focus on in terms of positive effects.
Follow up: Could you provide some examples of specific positive effects of drinking tea versus coffee?




Intermediate answer: Coffee drinkers can raise a mug to fiber, microbiome health and lowering risk for cancer and diabetes. But tea drinkers, do not despair. Tea is undoubtedly good for your blood pressure ... Caffeine in tea: While the exact amount of caffeine can vary depending on the specific type of tea you're sipping, it generally contains a much lower amount than coffee. For example, a cup of green tea 21 contains around 29 mg of caffeine, while the amount of caffeine in black tea 22 is slightly lower, at around 26 mg per cup. Some laboratory studies have shown that tea polyphenols can inhibit the growth of cancer cells; however, human studies have yielded mixed results. [5] 3. More Durable Energy Boost. While tea contains less caffeine than coffee when consumed in the same amount of servings, tea is high in L-theanine, which is a powerful antioxidant that also ... Coffee and tea both contain caffeine. A standard 8-ounce cup of coffee contains about 95 milligrams of

Llama.generate: prefix-match hit


 "Can Your Cup of Joe or Tea Be a Health Boost? - The debate between coffee and tea drinkers has been brewing for years, but recent research suggests that both beverages can have significant health benefits." 

The article goes on to explain how coffee and tea are similar in their caffeine content, with coffee containing more than tea. However, the article highlights studies that suggest that coffee drinkers may have a lower risk of developing certain types of cancer and diabetes. Tea drinkers also have some health benefits, including potentially lowering blood pressure and reducing the risk of heart disease.

The article goes on to discuss how both beverages can contribute to gut health, with studies suggesting that coffee may improve microbiome diversity and tea may promote the growth of beneficial bacteria. The article also highlights some potential drawbacks to consuming too much caffeine, including anxiety and insomnia.

Overall, the article concludes that both coffee and tea can 

Llama.generate: prefix-match hit


 
- To clarify the differences between coffee and tea, it may be helpful to provide a brief overview of their origins and how they are prepared. This could help readers better understand why these two popular beverages have different health benefits.
- It would also be beneficial to include specific examples of studies that demonstrate the potential health benefits of coffee and tea consumption. This could help readers better understand the science behind these claims and make more informed decisions about their own consumption habits.
- Finally, it may be helpful to provide some guidance on how much coffee or tea is considered safe for consumption. While there is no one-size-fits-all answer, providing some general guidelines could help readers avoid consuming too much caffeine and potentially experiencing negative side effects.
> Finished chain.
- To clarify the differences between coffee and tea, it may be helpful to provide a brief overview of their origin

Llama.generate: prefix-match hit




Coffee vs Tea: Which Is Healthier?
------------------------------
When it comes to choosing between coffee and tea, there are a few factors to consider when deciding which one is healthier for you. Both beverages have been linked to various health benefits, but they also contain different amounts of caffeine and other compounds that can affect your body in different ways.
Origins and Preparation:
----------------------
Coffee originated in Ethiopia and was first cultivated there in the 9th century. It is now one of the most widely consumed beverages in the world, with billions of cups being enjoyed every day. Coffee is typically brewed by pouring hot water over ground coffee beans, which releases the flavorful oils and soluble compounds that give coffee its unique taste and aroma.
Tea, on the other hand, originated in China and has been enjoyed there for thousands of years. There are many different types of tea, but they all come from the same plant, Camellia sinensis. Tea is typical

## Comments & Todo
This solution is still quite unstable. It would benefit from a larger context window, from splitting long inputs into smaller token chunks, and from similar optimizations. It can of course perform the task through the `OpenAI` interface, but that defeats the purpose, and OpenAI suffers from the same limited context. Until *storywriter* is available for `LlamaCpp`, any solution is suboptimal as far as I know. It is possible to run Storywriter with tooling other than `LlamaCpp`, but that is beyond the scope of this experiment.

It seems to need to be run a few times before it understands how to use the search tool.
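One way to live with this is to retry a failed run automatically. The helper below is a minimal sketch, assuming that a failed run surfaces as a Python exception and that the output varies between runs; `run_with_retries` is a hypothetical name, not a langchain function.

```python
def run_with_retries(run, query, attempts=3):
    """Call run(query) up to `attempts` times and return the first success."""
    last_error = None
    for _ in range(attempts):
        try:
            return run(query)
        except Exception as error:  # assumption: failures are raised as exceptions
            last_error = error
    raise RuntimeError(f"All {attempts} attempts failed") from last_error
```

It could be used as `run_with_retries(self_ask_with_search.run, "some question")`. A retry does not help when every attempt produces the same output.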

Raising `n_ctx` was key for `LlamaCpp`, but the maximum is 2048, which is workable for now. I have now chained several chains so that they editorialize each other's output.
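Splitting long inputs so that they fit the context could look roughly like the sketch below. It assumes about four characters per token, which is only a rough heuristic and not a real tokenizer count, and it ignores the tokens used by the prompt template itself; `split_for_context` is a hypothetical helper.

```python
def split_for_context(text, n_ctx=2048, max_tokens=350, chars_per_token=4):
    """Split text on whitespace into pieces that should fit the prompt budget."""
    budget_tokens = n_ctx - max_tokens  # room left once the reply is reserved
    budget_chars = budget_tokens * chars_per_token  # rough character estimate
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > budget_chars and current:
            # The current chunk is full: store it and start a new one.
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent through the chain separately, for example to summarize long search results before the journalist step.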