### Ch9_Security_Databricks Free Dolly - dolly-v2-12b model Demo

- dolly-v2-12b is a 12 billion parameter causal language model created by Databricks. It is derived from EleutherAI's Pythia-12b and fine-tuned on a ~15K record instruction corpus generated by Databricks employees and released under a permissive license (CC-BY-SA).
- It was trained on the Databricks machine learning platform, and the model is licensed for commercial use.
- The fine-tuning corpus, databricks-dolly-15k, contains instruction/response records in the capability domains from the InstructGPT paper: brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
- Smaller model sizes:
   - dolly-v2-7b, a 6.9 billion parameter model based on pythia-6.9b
   - dolly-v2-3b, a 2.8 billion parameter model based on pythia-2.8b
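
As a rough guide to which size fits a given GPU, the weight footprint can be estimated from the parameter count. The sketch below assumes 2 bytes per parameter (16-bit weights such as bfloat16) and counts the weights only, not activations or generation caches. The names `DOLLY_V2_PARAMS` and `weight_size_gb` are illustrative, not part of any library.

```python
# Approximate parameter counts, taken from the model list above.
DOLLY_V2_PARAMS = {
    "dolly-v2-12b": 12e9,
    "dolly-v2-7b": 6.9e9,
    "dolly-v2-3b": 2.8e9,
}


def weight_size_gb(num_params, bytes_per_param=2):
    """Estimate the size of the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


for name, params in DOLLY_V2_PARAMS.items():
    print(f"{name}: ~{weight_size_gb(params):.1f} GB in 16-bit precision")
```

For dolly-v2-7b this gives roughly 13.8 GB, which is consistent with the size of the `pytorch_model.bin` download reported for that model in this notebook.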

##### Limitations:

- dolly-v2-12b struggles with syntactically complex prompts, programming problems, mathematical operations, dates and times, open-ended question answering, enumerating lists of a specific length, stylistic mimicry, and humor. It is also prone to factual errors and hallucination. Moreover, dolly-v2-12b lacks some capabilities present in the original model, such as well-formatted letter writing.

#### System Requirements:
- The transformers library on a machine with GPUs
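
Before installing anything large, it can be worth confirming that a GPU is visible. The following is a minimal sketch that assumes PyTorch is the backend; `describe_accelerator` is a hypothetical helper name, and the function falls back to a plain message when `torch` is not installed.

```python
import importlib.util


def describe_accelerator():
    """Return a short description of the available compute device."""
    # Avoid an ImportError on machines where torch is not installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA GPU: {torch.cuda.get_device_name(0)}"
    return "no CUDA GPU detected (CPU only)"


print(describe_accelerator())
```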

#### Install Required Dependencies:

In [1]:
!pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"

Collecting accelerate<1,>=0.16.0
  Downloading accelerate-0.24.1-py3-none-any.whl (261 kB)
Collecting torch<2,>=1.13.1
  Downloading torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl (887.5 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch<2,>=1.13.1)
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch<2,>=1.13.1)
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)

##### Pipeline:
- Pipelines are an easy way to use models for inference. They are objects that hide most of the library's complex code behind a simple API dedicated to specific tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction, and Question Answering.
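
To make the abstraction concrete, here is a conceptual sketch of the preprocess, model, postprocess flow that a pipeline wraps. It is not the transformers implementation: `ToyPipeline` and `stub_sentiment_model` are hypothetical names, the tokenizer is a whitespace split, and the "model" is a keyword check.

```python
class ToyPipeline:
    """Hypothetical stand-in showing the preprocess -> model -> postprocess flow."""

    def __init__(self, model):
        self.model = model  # any callable mapping a token list to a label

    def preprocess(self, text):
        # Real pipelines run a tokenizer here; a whitespace split is a placeholder.
        return text.lower().split()

    def postprocess(self, raw_output):
        # Wrap the raw output in a list of dicts, one dict per result.
        return [{"label": raw_output}]

    def __call__(self, text):
        return self.postprocess(self.model(self.preprocess(text)))


def stub_sentiment_model(tokens):
    # Placeholder "model": a keyword check, not a trained network.
    return "POSITIVE" if "great" in tokens else "NEGATIVE"


classify = ToyPipeline(stub_sentiment_model)
print(classify("Pipelines are a great abstraction"))
```

The caller only ever sees one function call that takes text and returns structured results, which is the convenience the pipeline API is built around.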

In [None]:
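import torch
from transformers import pipeline

# Use the pipeline function from transformers to load the model's custom
# InstructionTextGenerationPipeline. trust_remote_code=True runs instruct_pipeline.py
# from the model repository, so review that file before enabling it.
generate_text = pipeline(model="databricks/dolly-v2-7b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")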


config.json:   0%|          | 0.00/819 [00:00<?, ?B/s]

instruct_pipeline.py:   0%|          | 0.00/9.16k [00:00<?, ?B/s]

A new version of the following files was downloaded from https://huggingface.co/databricks/dolly-v2-7b:
- instruct_pipeline.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


pytorch_model.bin:   0%|          | 0.00/13.8G [00:00<?, ?B/s]
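
The warning in the output above suggests pinning a revision so that a changed `instruct_pipeline.py` is not picked up silently. A minimal sketch follows, relying on the `revision` argument that `pipeline` accepts. The helper `pinned_pipeline_kwargs` is a hypothetical name and `"<commit-sha>"` is a placeholder, not a real revision of the repository.

```python
def pinned_pipeline_kwargs(model_id, revision):
    """Build pipeline() keyword arguments that pin the repository revision."""
    return {
        "model": model_id,
        "revision": revision,  # commit hash, tag, or branch name on the Hub
        "trust_remote_code": True,
        "device_map": "auto",
    }


# "<commit-sha>" is a placeholder; copy the actual hash from the model
# repository after reviewing instruct_pipeline.py at that commit.
kwargs = pinned_pipeline_kwargs("databricks/dolly-v2-7b", "<commit-sha>")
print(kwargs)

# Uncomment on a GPU machine once a real revision is filled in:
# import torch
# from transformers import pipeline
# generate_text = pipeline(torch_dtype=torch.bfloat16, **kwargs)
```

Pinning to a commit hash rather than a branch name means the remote code that runs is exactly the code that was reviewed.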

In [2]:
generate_text

NameError: ignored

##### Use the pipeline to answer instructions

In [1]:
res = generate_text("Explain to me the difference between nuclear fission and fusion.")
print(res[0]["generated_text"])


NameError: ignored

In [14]:
%%time
# %%time must be the first line of the cell; a bare %timeit with no statement times nothing.
res = generate_text("what is deep Learning?")
print(res[0]["generated_text"])




Deep Learning is a branch of machine learning focusing on the model architecture, optimization, and training algorithms. Deep learning models are typically non-linear functions of the input data that are capable of reasoning about high-level abstractions. Deep learning models require large amounts of labeled data in order to generalize well. In contrast to other machine learning models that have been successful in image recognition and search, deep learning models have broad applicability across many industries and use-cases. For these reasons, deep learning is sometimes described as the ‘work of Art’ among machine learning models.
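
When the elapsed time is needed as a value, or the code runs outside IPython, a single generation call can be timed with `time.perf_counter`. This is a sketch under assumptions: `timed_call` is a hypothetical helper and `fake_generate` is a stub standing in for `generate_text` so the example runs without a GPU.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_generate(prompt):
    # Hypothetical stand-in for generate_text, with the same return shape.
    time.sleep(0.01)
    return [{"generated_text": f"(stub answer to: {prompt})"}]


res, seconds = timed_call(fake_generate, "what is deep Learning?")
print(res[0]["generated_text"])
print(f"elapsed: {seconds:.3f} s")
```

Timing a single run is usually the sensible choice for a large model, since repeated runs of the kind `%timeit` performs can take a long time.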


In [15]:
res = generate_text("Who was the first to perform and publish careful experiments aiming at the definition of an international temperature scale on scientific grounds ?")
display(res[0]["generated_text"])


'The Belgian physicist Daniel Roberge (1727 – 1807) defined a temperature scale in which hotter objects register higher temperatures. His scale is now widely used throughout the world and was the starting point of further international agreement on temperature units.'

In [10]:
res = generate_text("When did Lincoln begin his political career?")
print(res[0]["generated_text"])

Lincoln began his political career in 1859 with his election to the Illinois state legislature.


In [11]:
res = generate_text("How many long was Lincoln's formal education?")
print(res[0]["generated_text"])

No formal education.


In [12]:
res = generate_text("summary about Lincoln")
print(res[0]["generated_text"])

Lincoln is a small town in Illinois, U.S. It is best known as the site of the Abraham Lincoln Presidential Library and Museum, where Lincoln was buried after his assassination.
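
Because the listed limitations include factual errors and hallucination, it can help to collect prompt and answer pairs in one place and check them against a trusted source. The sketch below assumes only the return shape used throughout this notebook (`res[0]["generated_text"]`). `collect_answers` is a hypothetical helper and `stub_generate` stands in for `generate_text`.

```python
def collect_answers(generate, prompts):
    """Run each prompt through generate and pair it with the first generated text."""
    rows = []
    for prompt in prompts:
        res = generate(prompt)
        rows.append({"prompt": prompt, "answer": res[0]["generated_text"]})
    return rows


def stub_generate(prompt):
    # Hypothetical stand-in with the same return shape as generate_text.
    return [{"generated_text": f"(stub answer to: {prompt})"}]


prompts = [
    "When did Lincoln begin his political career?",
    "summary about Lincoln",
]
for row in collect_answers(stub_generate, prompts):
    print(f"Q: {row['prompt']}\nA: {row['answer']}\n")
```

Passing the real `generate_text` in place of `stub_generate` would produce the same structure with model answers, ready for manual review.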
