## Running an LLM Locally

In [1]:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM


In [3]:
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2", cache_dir="./model")
model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2", cache_dir="./model")

In [4]:
from transformers import pipeline

## Inference

In [29]:
text_generation = pipeline("text-generation", model=model, tokenizer=tokenizer, max_length=100)

In [30]:
text_generation("The sky is")[0]["generated_text"]

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


"The sky is falling. I was standing in front of the door when two men in glasses yelled at me to open it. Then they left, so I opened the door and they disappeared again, just to see you guys walking along for the ride home, yeah. We couldn't even make out your real faces or any parts of your bodies. They told me a story of how a white guy and a girl ran across our back and told me that they couldn't get back. I'm really glad"

In [9]:
text_generation("I love")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'I love my parents because I\'m so blessed to have a family of my own making it possible to do what I do and I\'m incredibly thankful to anyone who supported me.\n\n"I don\'t want to pretend I\'m not being honest with'}]

## Custom Inference

In [17]:
tokens = tokenizer.encode("Hello there what are you doing", return_tensors="pt")

In [18]:
tokens

tensor([[15496,   612,   644,   389,   345,  1804]])

In [20]:
outputs = model.generate(tokens)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [21]:
outputs

tensor([[15496,   612,   644,   389,   345,  1804,   994,  1701,   366,    32,
           732,   345,   466,   553,   314,  8712,    11,   366,  4360,   356,
           389,   477,   994,   284,   423,  1257,    13,  3914,   514,   466,
           340,   355,   881,   355,  1744,    13,  3914,   338,   655,   467,
           284,   262, 17374,   290,  2822,   617,  1243,   526,   198,   198]])

In [23]:
response = tokenizer.decode(outputs[0])

In [24]:
response

'Hello there what are you doing here?" "Awe you do," I replied, "but we are all here to have fun. Let us do it as much as possible. Let\'s just go to the mall and buy some things."\n\n'

In [59]:
tokenizer("The sky is blue", return_tensors="pt")

{'input_ids': tensor([[ 464, 6766,  318, 4171]]), 'attention_mask': tensor([[1, 1, 1, 1]])}

In [60]:
tokenizer.eos_token_id

50256
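The `attention_mask` shown above marks which positions hold real tokens, which starts to matter once prompts of different lengths are padded into one batch. The following is a minimal pure-Python sketch of that idea, not the tokenizer's own implementation. `pad_batch` is a hypothetical helper; it pads on the left (the usual choice for decoder-only generation) and reuses 50256 as the pad id on the assumption that the end-of-sequence id doubles as padding, as the warnings above indicate. The second sequence is simply the first three ids of the first one, chosen for illustration.

```python
def pad_batch(sequences, pad_id):
    """Left-pad token-id lists to equal length and build matching attention masks."""
    width = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = width - len(seq)
        # Padding goes in front; the mask is 0 for padding and 1 for real tokens.
        input_ids.append([pad_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


ids, mask = pad_batch([[464, 6766, 318, 4171], [464, 6766, 318]], pad_id=50256)
print(ids)
print(mask)
```

Because the pad id and the end-of-sequence id are the same number here, the ids alone cannot tell padding from a real end-of-sequence token; the mask carries that information.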

In [41]:
model.generate(tokenizer.encode("The sky is", return_tensors="pt"), max_length=20).shape[1]

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


20
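The result of 20 reflects that `max_length` caps the total sequence, prompt included, whereas `max_new_tokens` caps only the generated part. The arithmetic can be sketched with two hypothetical helpers (neither is part of transformers). The prompt length of 3 is an assumption based on the four ids that "The sky is blue" produced above.

```python
def new_token_budget(prompt_len, max_length):
    """Upper bound on generated tokens when the cap is given as max_length."""
    return max(max_length - prompt_len, 0)


def total_length_cap(prompt_len, max_new_tokens):
    """Equivalent max_length when the cap is given as max_new_tokens."""
    return prompt_len + max_new_tokens


prompt_len = 3  # assumed token count for "The sky is"
print(new_token_budget(prompt_len, max_length=20))
print(total_length_cap(prompt_len, max_new_tokens=100))
```

These are upper bounds only: generation can stop earlier if the model emits the end-of-sequence token.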

## Inference Function

In [63]:
def generate_text(max_length=1, seed_text=""):
    """Generate a continuation of seed_text; max_length counts new tokens only."""
    encoded = tokenizer(seed_text, return_tensors="pt")
    # generate() counts the prompt toward max_length, so add the prompt length.
    total_length = max_length + encoded["input_ids"].shape[1]

    outputs = model.generate(
        encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        pad_token_id=tokenizer.eos_token_id,  # reuse EOS as the pad token
        max_length=total_length,
    )
    return tokenizer.decode(outputs[0])

In [65]:
print(generate_text(seed_text="The sky is", max_length=100))

The sky is the limit."

The wind's force

Even if a man like Kukkou didn't think about moving up or down, the wind's force could not be compared to that of a wind. This wind's power was comparable to the distance travelled by an aircraft.

The wind was only at its strongest.

Although this wind was extremely strong, it could not rival the strength of a plane.

Only even after all these years, it would have already been


## Streaming text

In [66]:
from transformers import TextStreamer

In [69]:
question = "I am sure that"
encoded = tokenizer(question, return_tensors="pt")

In [73]:
streamer = TextStreamer(tokenizer, skip_prompt=False)
outputs = model.generate(
    **encoded,
    streamer=streamer,
    pad_token_id=tokenizer.eos_token_id,
    max_length=100,
    do_sample=True,  # top_p only takes effect when sampling is enabled
    top_p=0.8,
)

I am sure that all of the other characters would benefit from more detailed depictions of their actions."

The Star Wars prequels did not have a scene where Han Solo and Chewbacca were having sex.

Star Wars: The Force Awakens has seen a host of sex scenes and other scenes where characters in the films have sex.

Star Wars: Episode VII - A New Hope was set to star Kylo Ren in the film, which was also directed by J.J.
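The `top_p=0.8` argument in the cell above requests nucleus sampling: at each step only the most likely tokens whose combined probability reaches the threshold stay eligible. A simplified pure-Python sketch of that filtering step follows. It is not the library's implementation, `top_p_filter` is a hypothetical helper, and the toy probabilities are made up for illustration.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Made-up next-token distribution, for illustration only.
toy = {" blue": 0.5, " clear": 0.25, " falling": 0.15, " green": 0.1}
nucleus = top_p_filter(toy, top_p=0.8)
print(nucleus)
```

A lower `top_p` shrinks the candidate set and makes output more predictable; a value of 1.0 keeps every token.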


In [74]:
outputs

tensor([[   40,   716,  1654,   326,   477,   286,   262,   584,  3435,   561,
          4414,   422,   517,  6496, 44528,   286,   511,  4028,   526,   198,
           198,   464,  2907,  6176,   662,   421,  1424,   750,   407,   423,
           257,  3715,   810,  9530, 20284,   290,  2580, 39346, 43552,   547,
          1719,  1714,    13,   198,   198,  8248,  6176,    25,   383,  5221,
         39613,   468,  1775,   257,  2583,   286,  1714,  8188,   290,   584,
          8188,   810,  3435,   287,   262,  7328,   423,  1714,    13,   198,
           198,  8248,  6176,    25,  7922, 19691,   532,   317,   968, 13408,
           373,   900,   284,  3491, 39859,    78,  7152,   287,   262,  2646,
            11,   543,   373,   635,  7924,   416,   449,    13,    41,    13]])

## Streaming with Threading

In [75]:
from transformers import TextIteratorStreamer

In [76]:
from threading import Thread

In [82]:
inputs = tokenizer(["The sky is."], return_tensors="pt")

In [83]:
streamer = TextIteratorStreamer(tokenizer, skip_prompt=False)
generation_kwargs = dict(inputs, streamer=streamer, max_new_tokens=100)

thread = Thread(target=model.generate, kwargs=generation_kwargs)

In [84]:
thread.start()
for new_text in streamer:
    print(new_text, end="")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


The sky is.

You can't see things from that space but the sky is a place so we're going to start looking a little deeper into it. We have a lot of new technology and exciting things ahead of us and this is just more of an engineering question…

NARRATOR: Dr. Ettorr said, "We've just identified a new form of particle physics, called the super-physics' of particles which are actually quantum waves. That's exactly what we are
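The pattern in the cells above is producer and consumer: generation runs in a background thread and hands over text chunks, while the main thread iterates over them as they arrive. The standard-library sketch below shows that pattern in isolation. `QueueStreamer` and `fake_generate` are hypothetical stand-ins, not transformers classes, and the word list replaces real model output.

```python
from queue import Queue
from threading import Thread

_DONE = object()  # sentinel that marks the end of the stream


class QueueStreamer:
    """Hypothetical iterator-style streamer backed by a thread-safe queue."""

    def __init__(self):
        self._queue = Queue()

    def put(self, text):
        self._queue.put(text)

    def end(self):
        self._queue.put(_DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until the producer supplies a chunk
        if item is _DONE:
            raise StopIteration
        return item


def fake_generate(words, streamer):
    """Stand-in for a generation call: emit chunks one by one, then signal completion."""
    for word in words:
        streamer.put(word)
    streamer.end()


demo_streamer = QueueStreamer()
worker = Thread(target=fake_generate, args=(["The", " sky", " is", " blue"], demo_streamer))
worker.start()
chunks = list(demo_streamer)  # consume in the main thread while the worker produces
worker.join()
print("".join(chunks))
```

Running the producer in its own thread is what keeps the consuming loop from waiting for the whole generation to finish before the first chunk is shown.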

In [94]:
from transformers import TextIteratorStreamer
from threading import Thread

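def get_stream_response(input_text="", max_length=100):
    """Print generated text as it is produced; max_length is the number of new tokens."""
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=False)
    inputs = tokenizer(input_text, return_tensors="pt")
    generation_kwargs = dict(inputs, streamer=streamer, max_new_tokens=max_length)
    # Run generation in a background thread so this thread can consume the stream.
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()
    for new_text in streamer:
        print(new_text, end="")
    thread.join()  # wait for the generation thread to finish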

In [95]:
get_stream_response("The sky is")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


The sky is red. I have heard that there are other beings like me alive." ―Alyeanna [src]

Alyeanna was named after a mother, a daughter and wife who she married to her lover, Alysanna, later on to a girl named Amida who was her child's name[9] and whom she taught to her daughter, Alysanna Alysanna, at age 17[10]. Alyeanna was not only Alysanna's step-mother

In [97]:
get_stream_response("who are you", 1000)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


who are you?) or who are your own parents.

In the first few seconds of pregnancy, the mother doesn't think she's got everything. "Oh, no, it's okay," she says. "It's just that I'm scared."

Some moms don't get a chance to get back in touch. Some are unable to cope with daily physical activity. Some try to prevent or even prevent pregnancy and make the worst decision that they can. There are, of course, women who need to take extra time to get through an emotional pregnancy.

So don't be intimidated, even by people who care about your wellbeing and are willing to pay for such attention. They may be more likely to respond with a desire to have it done than the people you ask but don't always know how they can make that happen. You should seek help right now if you have to get involved.

If you want to get to grips with a healthy pregnancy, look no further than this week's Women.com survey. It offers answers (and you might be interested in participating at the event) about women looking 