### Ch9_Security_Databricks Free Dolly - dolly-v2-12b model Demo

- dolly-v2-12b is a 12 billion parameter causal language model created by Databricks. It is derived from EleutherAI's Pythia-12b and fine-tuned on a ~15K record instruction corpus that was generated by Databricks employees and released under a permissive license (CC-BY-SA).
- It was trained on the Databricks machine learning platform and is licensed for commercial use.
- The fine-tuning corpus, databricks-dolly-15k, contains instruction/response records in the capability domains from the InstructGPT paper: brainstorming, classification, closed QA, generation, information extraction, open QA and summarization.
- Smaller model sizes are also available:
   - dolly-v2-7b, a 6.9 billion parameter model based on pythia-6.9b
   - dolly-v2-3b, a 2.8 billion parameter model based on pythia-2.8b

##### Limitations:

- dolly-v2-12b struggles with syntactically complex prompts, programming problems, mathematical operations, dates and times, open-ended question answering, enumerating lists of a specific length, stylistic mimicry and humor, and it is prone to factual errors and hallucination. It also lacks some capabilities that are present in the original model, such as well-formatted letter writing.

#### System Requirements:
- The transformers library on a machine with GPUs
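
As a rough guide to how much GPU memory each variant needs, the weights alone take about two bytes per parameter in 16-bit precision (such as the `torch.bfloat16` dtype used when loading the model). The sketch below does this arithmetic for the nominal parameter counts listed in the overview. The helper name `weight_memory_gb` is hypothetical, and the estimate ignores activations, the key/value cache and framework overhead, so actual usage will be higher.

```python
# Nominal parameter counts (in billions) as listed in the model overview.
PARAMS_BILLIONS = {
    "dolly-v2-12b": 12.0,
    "dolly-v2-7b": 6.9,
    "dolly-v2-3b": 2.8,
}

def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for name, size in PARAMS_BILLIONS.items():
    print(f"{name}: ~{weight_memory_gb(size):.1f} GB of weights in bfloat16")
```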

#### Install Required Dependencies:

In [None]:
!pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"

##### Pipeline:
- The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering.
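
To make the idea of abstracting the complex code concrete, the following toy class mimics the three stages a pipeline typically chains together: preprocessing the input, running a model, and postprocessing the output. It is purely illustrative. `ToyPipeline` and its stand-in "model" are hypothetical and are not the transformers implementation.

```python
from typing import Callable, Dict, List

class ToyPipeline:
    """Hypothetical stand-in showing the preprocess -> model -> postprocess flow."""

    def __init__(self, model: Callable[[List[str]], List[str]]):
        self.model = model

    def preprocess(self, text: str) -> List[str]:
        # A real pipeline would run a tokenizer; here we just split on whitespace.
        return text.split()

    def postprocess(self, tokens: List[str]) -> List[Dict[str, str]]:
        # Mirror the list-of-dicts result shape used in this notebook.
        return [{"generated_text": " ".join(tokens)}]

    def __call__(self, text: str) -> List[Dict[str, str]]:
        return self.postprocess(self.model(self.preprocess(text)))

# Stand-in "model": upper-cases every token instead of generating text.
shout = ToyPipeline(model=lambda tokens: [t.upper() for t in tokens])
print(shout("hello pipeline")[0]["generated_text"])  # prints "HELLO PIPELINE"
```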

In [None]:
import torch
from transformers import pipeline
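# Use the pipeline function from transformers to load the model's custom
# InstructionTextGenerationPipeline (this is why trust_remote_code=True is set).
# This demo loads the smaller dolly-v2-7b variant.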
generate_text = pipeline(model="databricks/dolly-v2-7b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
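
The custom pipeline wraps each instruction in the prompt format the model was fine-tuned on before generating. The sketch below shows what that wrapping might look like. The template text is an assumption based on the model repository's `instruct_pipeline.py` and should be checked against that file, and `build_prompt` is a hypothetical helper, not part of the library.

```python
# Assumed prompt template; verify against instruct_pipeline.py in the model repo.
INTRO_BLURB = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"

def build_prompt(instruction: str) -> str:
    """Hypothetical helper: wrap a raw instruction in the assumed template."""
    return f"{INTRO_BLURB}\n\n{INSTRUCTION_KEY}\n{instruction}\n\n{RESPONSE_KEY}\n"

print(build_prompt("What is deep learning?"))
```

When calling `generate_text` you pass only the raw instruction; the pipeline is expected to apply its own template internally.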


##### Use the pipeline to answer instructions

In [None]:
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])


In [None]:
%%time
# Cell magic (must be the first line of the cell): reports the time of a single run.
res = generate_text("What is deep learning?")
print(res[0]["generated_text"])
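
Outside a notebook, where IPython magics are unavailable, the same measurement can be taken with `time.perf_counter`. This is a minimal sketch: `timed` is a hypothetical helper, and `fake_generate` is a stand-in so the snippet runs without a GPU. In the notebook you would pass `generate_text` instead.

```python
import time

def timed(fn, *args, **kwargs):
    """Hypothetical helper: call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def fake_generate(prompt):
    # Stand-in for generate_text; returns the same list-of-dicts shape.
    return [{"generated_text": f"(answer to: {prompt})"}]

res, seconds = timed(fake_generate, "What is deep learning?")
print(res[0]["generated_text"])
print(f"took {seconds:.4f} s")
```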


In [None]:
res = generate_text("Who was the first to perform and publish careful experiments aiming at the definition of an international temperature scale on scientific grounds?")
print(res[0]["generated_text"])


In [None]:
res = generate_text("When did Lincoln begin his political career?")
print(res[0]["generated_text"])

In [None]:
res = generate_text("How long was Lincoln's formal education?")
print(res[0]["generated_text"])

In [None]:
res = generate_text("Write a short summary about Lincoln.")
print(res[0]["generated_text"])
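
The question cells above all follow the same pattern, so they can be collapsed into a loop. The sketch below assumes only that the generator is a callable returning a list of dicts with a `generated_text` key, as in the cells above. `ask_all` is a hypothetical helper, and `echo` is a stand-in used so the example runs without loading the model.

```python
from typing import Callable, Dict, List

def ask_all(generate: Callable[[str], List[Dict[str, str]]],
            prompts: List[str]) -> Dict[str, str]:
    """Hypothetical helper: run each prompt and map it to the generated text."""
    return {prompt: generate(prompt)[0]["generated_text"] for prompt in prompts}

def echo(prompt: str) -> List[Dict[str, str]]:
    # Stand-in for generate_text with the same return shape; reverses the prompt.
    return [{"generated_text": prompt[::-1]}]

answers = ask_all(echo, ["abc", "Lincoln"])
for prompt, answer in answers.items():
    print(f"Q: {prompt}\nA: {answer}\n")
```

In the notebook, `ask_all(generate_text, [...])` would run the real model once per prompt.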