**Text Generation**

This code defines a function generate_header(paragraph) which takes a paragraph of text as input and generates a header from it. It does this by:

Tokenizing the paragraph into words.
Removing stopwords (common words like "and", "the", etc.) and punctuation.
Finding the five most common significant words.
Creating and returning a header composed of these words in title case.

The function requires NLTK's 'punkt' and 'stopwords' resources, which are downloaded at the beginning.

In [None]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from collections import Counter
import string

# Make sure to download these resources from NLTK
nltk.download('punkt')
nltk.download('stopwords')

def generate_header(paragraph):
    # Tokenize the paragraph into words
    words = word_tokenize(paragraph)

    # Remove stopwords and punctuation; build the stopword set once for fast lookups
    stop_words = set(stopwords.words('english'))
    words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]

    # Find the most common words
    most_common_words = Counter(words).most_common(5)

    # Create a header using the most common words
    header = ' '.join(word for word, count in most_common_words)
    return header.title()



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
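As a dependency-free illustration of the counting step, the sketch below approximates the same idea with only the standard library. The name frequency_header, the regular-expression tokenizer, and the tiny stopword set are assumptions made for this example; they are not part of the notebook and will not match NLTK's tokenization exactly.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used only for this illustration;
# the notebook itself relies on NLTK's much larger English list.
TINY_STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def frequency_header(text, top_n=5, stopwords=TINY_STOPWORDS):
    # Keep lowercase alphabetic tokens only, roughly mirroring the isalpha() filter.
    tokens = re.findall(r"[a-z]+", text.lower())
    significant = [t for t in tokens if t not in stopwords]
    # most_common() lists equally frequent words in the order they were first seen.
    top = Counter(significant).most_common(top_n)
    return " ".join(word for word, _ in top).title()

print(frequency_header("The cat and the dog chased the cat in the garden", top_n=3))
```

Because ties are broken by first occurrence, the header tends to follow the word order of the opening sentence whenever most words appear only once.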



This code defines a function generate_header_with_ner(paragraph) which generates a header from a paragraph using named entity recognition (NER) with the spaCy NLP library. It processes the paragraph, extracts named entities (like people, places, organizations), identifies the three most common entities, and then creates a header using these entities in title case.

In [None]:
import spacy
from collections import Counter

# Load spaCy model
nlp = spacy.load('en_core_web_sm')

def generate_header_with_ner(paragraph):
    # Process the paragraph with spaCy
    doc = nlp(paragraph)

    # Extract named entities
    entities = [ent.text for ent in doc.ents]

    # Find the most common entities
    most_common_entities = Counter(entities).most_common(3)

    # Create a header using the most common entities
    header = ' '.join(entity for entity, count in most_common_entities)
    return header.title()
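The function above counts every entity spaCy returns, so dates and numbers can end up in the header. One possible refinement, sketched here under the assumption that entities are passed in as (text, label) pairs, is to drop selected labels before counting. The helper name header_from_entities and the default skip list are hypothetical, and the sample pairs are hand-written rather than real model output.

```python
from collections import Counter

def header_from_entities(entities, top_n=3, skip_labels=("DATE", "CARDINAL")):
    # `entities` is a list of (text, label) pairs, e.g. built from
    # [(ent.text, ent.label_) for ent in doc.ents] on a spaCy Doc.
    kept = [text for text, label in entities if label not in skip_labels]
    top = Counter(kept).most_common(top_n)
    return " ".join(text for text, _ in top).title()

# Hand-written sample pairs for illustration, not real spaCy output.
sample = [("Java 17", "PRODUCT"), ("September 2021", "DATE"),
          ("Java", "PRODUCT"), ("Java 17", "PRODUCT")]
print(header_from_entities(sample))
```

Whether dates belong in a header depends on the text, so the skip list is left as a parameter.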


This code snippet creates a function generate_header_with_keyphrases(paragraph) that generates a header from a paragraph by identifying key phrases. It uses the RAKE (Rapid Automatic Keyword Extraction) algorithm from the rake_nltk library, which works in conjunction with NLTK's stopwords. The function:

Initializes RAKE using NLTK's list of stopwords.
Extracts keywords and key phrases from the input paragraph.
Selects the top three ranked key phrases.
Joins these phrases with a '|' symbol to form a header.
Returns the header as the output.

In [None]:
pip install rake-nltk


Collecting rake-nltk
  Downloading rake_nltk-1.0.6-py3-none-any.whl (9.1 kB)
Installing collected packages: rake-nltk
Successfully installed rake-nltk-1.0.6


In [None]:
from rake_nltk import Rake
import nltk

# Make sure to download NLTK stopwords
nltk.download('stopwords')

def generate_header_with_keyphrases(paragraph):
    # Initialize RAKE by using NLTK's stopwords
    rake_nltk_var = Rake()

    # Extract keywords from text
    rake_nltk_var.extract_keywords_from_text(paragraph)
    key_phrases = rake_nltk_var.get_ranked_phrases()[:3]  # Get top 3 ranked phrases

    # Create a header using the key phrases
    header = ' | '.join(key_phrases)
    return header


ModuleNotFoundError: ignored
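Since the cell above could not import rake_nltk, the following standard-library sketch shows roughly how RAKE ranks phrases: text is split into candidate phrases at stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase's score is the sum of its word scores. This is a simplified illustration, not the rake_nltk implementation; the function name rake_phrases and the small stopword set are assumptions.

```python
import re
from collections import defaultdict

# Hypothetical mini stopword list for illustration; rake_nltk uses NLTK's list.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "for", "to", "with"}

def rake_phrases(text, stopwords=STOPWORDS):
    # 1. Split the text into candidate phrases at punctuation and stopwords.
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)

    # 2. Count how often each word occurs and how many words it co-occurs with.
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, including the word itself

    # 3. Score each phrase as the sum of degree / frequency over its words.
    scored = {}
    for phrase in phrases:
        scored[" ".join(phrase)] = sum(degree[w] / freq[w] for w in phrase)
    return sorted(scored, key=scored.get, reverse=True)

text = ("Keyword extraction is a task in natural language processing. "
        "Extraction of keywords is useful.")
print(" | ".join(rake_phrases(text)[:3]))
```

Longer phrases made of words that rarely appear on their own score highest, which is why multi-word phrases usually lead the ranking.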

This code defines a function generate_header_from_text(input_text) that generates a header by summarizing the input text. It uses the summarization pipeline from the transformers library; when no model is specified, the pipeline falls back to a default checkpoint (a distilled BART model in the run recorded below). The function:

Loads a summarization pipeline.
Truncates the input text to at most 1024 characters as a simple guard against over-long inputs. Note that this limit counts characters, not tokens.
Summarizes the (possibly truncated) text, with specified maximum and minimum lengths for the summary.
Returns the first summary as the header.

In [None]:
from transformers import pipeline

def generate_header_from_text(input_text):
    # Load a summarization pipeline
    summarizer = pipeline("summarization")

    # Check and manage text length for the model
    max_chunk_size = 1024  # Character limit, used as a rough proxy for the model's token limit
    if len(input_text) > max_chunk_size:
        # Here you can add a method to split the text into chunks
        # For simplicity, we're truncating the text in this example
        input_text = input_text[:max_chunk_size]

    # Generate summary
    summary = summarizer(input_text, max_length=75, min_length=30, do_sample=False)

    # Return the first summary (can be used as a header)
    return summary[0]['summary_text']
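The cell above cuts the input at 1024 characters, which is a character count rather than a token count, and its comments leave chunking open. A minimal sketch of one way to split text on sentence boundaries is shown below. The helper name split_into_chunks and its max_chars parameter are assumptions for illustration; a token-accurate version would need to measure length with the model's tokenizer instead.

```python
import re

def split_into_chunks(text, max_chars=1024):
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Either the sentence fits, or it starts a new chunk
            # (an over-long single sentence is kept whole rather than cut).
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

demo = "First sentence here. Second one follows! A third? And a fourth."
for chunk in split_into_chunks(demo, max_chars=45):
    print(repr(chunk))
```

Each chunk could then be summarized separately and the partial summaries combined, at the cost of one model call per chunk.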

This code defines a function generate_header_with_t5(input_text) that uses the T5 (Text-To-Text Transfer Transformer) model to generate a header by summarizing the provided text. It works as follows:

A T5 model (like 't5-large') and its tokenizer are loaded from the transformers library.
The text is preprocessed by prefixing it with "summarize: " to instruct the model that summarization is required.
The text is encoded to token IDs, with a maximum length of 512 tokens and truncation if necessary.
A summary is generated using the model, with specified constraints on the maximum and minimum length of the output.
The summarized text is decoded from token IDs to a string, omitting any special tokens.
This summarized text is returned as the header.

In [None]:
from transformers import T5ForConditionalGeneration, T5Tokenizer

def generate_header_with_t5(input_text):
    model_name = "t5-large"  # You can choose smaller versions like 't5-small' or 't5-base'
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    # Preprocess the text
    input_text = "summarize: " + input_text
    input_ids = tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True)

    # Generate the output
    summary_ids = model.generate(input_ids, max_length=50, min_length=20, length_penalty=2.0, num_beams=4, early_stopping=True)
    header = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    return header


This code snippet defines a function generate_header_with_bart(input_text) which uses the BART (Bidirectional and Auto-Regressive Transformers) model for summarization. The BART model and tokenizer from the transformers library are used to generate a concise header from the input text. Here's a quick overview:

The BART model and tokenizer, specifically the 'facebook/bart-large-cnn' version, are loaded.
The input text is encoded to a format suitable for the model, with a maximum length of 1024 tokens, truncating longer texts.
The model generates a summary, using a beam search with 4 beams, a maximum output length of 50 tokens, and early stopping for efficiency.
The generated summary is decoded back to text, skipping any special tokens.
This decoded summary is returned as the header, providing a condensed version of the main content of the input text.

In [None]:
from transformers import BartForConditionalGeneration, BartTokenizer

def generate_header_with_bart(input_text):
    tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
    model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

    # Encode the text
    inputs = tokenizer([input_text], max_length=1024, return_tensors='pt', truncation=True)

    # Generate summary
    summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=50, early_stopping=True)
    header = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    return header



The function generate_header_with_gpt2(input_text) uses the GPT-2 model to generate a header for a given text. It follows these steps:

Initializes the GPT-2 tokenizer and model.
Prepends the input text with a prompt to guide the model to generate a title.
Encodes the adjusted input, truncating if necessary.
Generates a short text (up to 10 new tokens) using the model.
Decodes and returns this generated text as the header. If no output is generated, it returns a default message.

In [None]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

def generate_header_with_gpt2(input_text):
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Adjusted prompt for header generation
    adjusted_input = f"Generate a title for the text:\n\n{input_text}"

    # Encode input text and truncate if too long
    inputs = tokenizer.encode_plus(adjusted_input, return_tensors='pt', truncation=True, max_length=512, add_special_tokens=True)

    # Generate text within the model's limits
    try:
        outputs = model.generate(
            inputs['input_ids'],
            attention_mask=inputs['attention_mask'],
            max_new_tokens=10,  # Control the number of new tokens to generate
            num_return_sequences=1,
            no_repeat_ngram_size=2,
            pad_token_id=tokenizer.eos_token_id  # GPT-2 has no dedicated pad token
        )

        # Decode only the newly generated tokens, not the echoed prompt
        prompt_length = inputs['input_ids'].shape[1]
        header = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
        return header if header else "No header generated"
    except Exception as e:
        print(f"An error occurred: {e}")
        return "Unable to generate header"


The generate_header_with_gpt_neo function uses the GPT-Neo model to generate a header for given text. It involves:

Initializing the GPT-Neo model and tokenizer.
Creating a prompt with the input text for header generation.
Encoding and truncating the prompt to a maximum length.
Generating a concise output using the model.
Decoding and returning the output as a header.

In [None]:
from transformers import GPTNeoForCausalLM, GPT2Tokenizer

def generate_header_with_gpt_neo(input_text, model_name="EleutherAI/gpt-neo-2.7B"):
    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPTNeoForCausalLM.from_pretrained(model_name)

    prompt = f"Generate a concise and informative header for the following paragraph: {input_text}"

    inputs = tokenizer.encode(prompt, return_tensors="pt", truncation=True, max_length=512)

    # Generate the output
    output = model.generate(inputs, max_length=60, num_return_sequences=1, temperature=0.7)
    header = tokenizer.decode(output[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)

    return header.strip()


**Testing**

In [None]:
input_text = "Spoiler alert: this recap is for viewers of Deutschland 83 on Channel 4 and Walter Presents, please refrain from posting details from later episodes if youâ€™ve see more. Catch up on episode four here Now thatâ€™s more like it. This was easily the best episode so far, a tense and dark hour in which a number of characters made decisions that will have far-reaching effects. Most of them bad, if Iâ€™m any judge. I understand those who feel that Deutschland 83 has teetered too far into the absurd to work as a serious drama, but I also felt that this week addressed many of those issues, giving us some good character development into the bargain.Is Martin still the worst spy in the universe? Quite probably, but again Iâ€™d argue that this is part of the point â€“ as much as a spy drama, itâ€™s a story about young people struggling to make the correct choices while being conspicuously and continually let down by authority figures. What the Wingers seem intent on doing is giving us a world populated by people who are trying desperately to do the right thing, but are repeatedly doomed to do wrong by the time in which they are living. The west Poor old Martin. No sooner had he shed his inhibitions in the ashram (and to Bonnie Tyler, no less) than he found himself called back to duty and ordered to Berlin, ostensibly to help out his sick mother by giving her his kidney. Of course, Martinâ€™s missions are never as simple as that, and thus our hero was really in Berlin as a patsy, a convenient way of ensuring that a bunch of explosives was delivered to the terrorist who wanted them â€“ with devastating effect. The bomb itself really happened and was really the work, as stated, of Carlos the Jackal. As to the mysterious planter of the bomb, Iâ€™m presuming he was another patsy of a kind, an associate of Carlosâ€™s who agreed to plant the bomb at the French cultural centre. 
Also worth noting: the suggestion that the East German government had some involvement in the West German bombing is not a new one and the case against Carlos was partially built from old Stasi files.Martin wasnâ€™t the only one having a bad time this week, as Alex realised that loveâ€™s young dream was not quite as rosy as he had hoped, a situation that led to him storming off from the Mansion of Manipulation, initially seeking solace in the ashram (like everyone else before him) before turning up and offering his services to the DDR. Itâ€™s rather typical of Tobiasâ€™s luck that, even though Alex rejected both his idea of returning to the army and his protestations of love, he ended up offering to spy for the East Germans without Tischbier having to sully his hands by suggesting it. That man is so smooth that even his failures turn into some sort of success. The east Over in the east, betrayal was the name of the game as Annett, not content with rejecting Thomas, chose to sell out his mobile library at the same time. Oh Annett, I will find this very hard to forgive, although I do wonder exactly what message she wanted to get to Martin. Did she think he wasnâ€™t going to turn up and help his mother? Because if that was the reason she contacted Schweppenstette, then I suspect she will end up regretting it. As for poor Thomas, now my favourite character on this show after his impassioned defence of literature, things donâ€™t look good for him at all. That will teach him to go talking about love and literature to the first innocent-looking blonde he meets. Nor are they looking good for Ingrid, even though a rather battered Martin did manage to make it in time to donate the kidney. Will she survive? 
A couple of weeks ago I would have said sure, but this show is getting notably darker by the week so, despite Lenoraâ€™s intervention, I would now put her chances at 50-50 at best.Stasi files â€¢ Martin really lost his innocence this week â€“ no sooner had he lectured Tobias about being a killer, than he finds himself murdering a stranger on a railway track. I do wonder how much longer he can square his conscience with the life he is having to lead. â€¢ It was also made clear that Tobias was responsible for Lindaâ€™s death â€“ he may not have driven the car, but thereâ€™s no doubting that he gave the order. â€¢ Still, I was impressed by the way in which Martin used my sisterâ€™s old technique of telling the truth so outlandishly that the other person doesnâ€™t believe you. Got us out of many a sticky situation with our parents, I can tell you. â€¢ Lenora (Maria Schrader) almost revealed a heart this week. Despite all the manipulation, I think she genuinely does love her sister, and she really sold the tension as she waited for Martin to arrive. (And Iâ€™m presuming that she is the reason why Martin made it East relatively fast, despite turning up bloody and battered to the checkpoint). â€¢ The Edel home really isnâ€™t a safe place for a fish. After Renate attempted to kill them off with champagne glasses last week, Ursula went a step further this week, dropping one on the ground as a minor act of revenge against her husband. â€¢ Talking of Ursula, I was amused by the relish with which she devoured Alexâ€™s toast. I understand why sheâ€™s protecting her son, but her commitment to that cause was still wildly entertaining. â€¢ One thing I wasnâ€™t sure of â€“ did Ursula intend Frau Netz to think Alex has Aids or was the secretary jumping to conclusions? â€¢ Regarding Frau Netz â€“ do we also think she might be one of Lenoraâ€™s secretaries, or is that a spy reveal too far? 
â€¢ I loved the way Yvonneâ€™s ashram mate took a moment to centre himself after Alexâ€™s outburst. Sometimes itâ€™s the small things that work. â€¢ I also continue to be amused by Walter Schweppenstette, despite his capacity for evil â€“ I was particularly taken by his discussion about Worcester sauce. Sorry, but Iâ€™m a sucker for a man who knows his condiments. Song of the week Only two songs again this week, but they were both well used. Iâ€™m giving the edge to Mad World by Tears for Fears just because it made this cover from Donnie Darko â€“ an improbable Christmas No 1 â€“ pop into my head for the first time in years. Quote of the week â€œGo on, give each other back rubs while the world goes up in a mushroom cloudâ€ â€“ Alex almost wins my heart with his impassioned take down of hippies. Weâ€™ve all been there, Alex. So what did you think â€“ do you approve of the increasingly dark tone? Who is in worse trouble, Alex or Thomas? What about Martin? Will Ursula continue to lie to her husband? And will Yvonne ever make it out of that ashram? As ever, all speculation and no spoilers welcome below â€¦"

In [None]:
input_text2 = "major political event gallop towards u speed bolting horse even two â€“ eu referendum trident debate â€“ running weird direction unpredictable velocity left us u london mayoral election could turn referendum housing issue city could unite many people across fundamental question concentration wealth ever fewer hand way live future look forward way child live see life getting easier harder accept inevitable avenue change ready explore eu referendum could like scottish referendum become conversation good europe â€“ extension good united kingdom â€“ would look like could debate purpose international cooperation could mean wage tax privatisation commonly held asset minded discus post capitalist sharing economy renewable energy moved freely across border fruit technological advance moved beyond appropriation monopoly would place start trident could trigger new thinking foreign policy could allow say loud one blindingly obvious fact cold war condition made threat mass annihilation look like sensible compact hostile nation longer pertain short time could overturn sclerotic norm replace genuinely modern ambitious thinking left could could allow diverted path nowhere chasing stick thrown gleeful opponent sadiq khan big leftie jeremy corbyn ever work together donâ€™t think exactly thing whatâ€™s constitutional basis labour coup left support eu stiffed greece lost humanitarian heart one thing easier leveraging leftiesâ€™ natural anti authoritarianism opening fissure turn chasm britney spear said george bush â€œhonestly think trust president every decision make support â€ fascinated remark remember worrying like scab googling check wording search term â€œbritneyâ€ â€œbushâ€ led wilderness let tell thought interested good conservative yielding authority really interested sense revulsion support anybody every decision made would indivisible dead normal time naturally subversive person never confront question whether capable obedience sort person would obey would 
never power normal time corbyn avowedly anti establishment leader opposition prevailing convention pretend heâ€™s comical sideshow normal opposition resume due course sooner later reality catch self styled â€œrealisticâ€ view corbynâ€™s pressing problem place find support among people far prefer critique deal â€“ account long take fire someone unable theyâ€™ve agreed heâ€™s right point person le comfortable authority anyone â€“ subject endless scornful commentary yet everyone progressive side â€“ whether parliamentary labour party constituency party member merely sympathetic idea â€“ need consider deal natural reluctance cheerleading allowing u neutralised without question labour mp whose opposition leader implacable disagree profoundly matter principle whether call people tory closet tory misdirected lib dems irrelevant salient point describe full party range also includes mp agree many corbynâ€™s idea worry electability cleave fundamental worry delivery mp worry play corbyn refuse endorse shoot kill rather considering whether shoot kill large enough pressing enough issue divide mp support trident donâ€™t want associated peacenik deeply wedded weapon mass destruction likely â€“ indeed iâ€™d say certainty â€“ conservative mp oppose trident wouldnâ€™t dream making issue ability right gloss internal difference frankly awe inspiring recent debate social housing cameron accused corbyn small c conservative thatâ€™s probably party written cv using insult nobody uttered squeak itâ€™s impressive itâ€™s inimitable left cannot mimic obedience right copy pursuit unity sake need find kind loyalty doesnâ€™t preclude critical distance harmony accommodate audible difference time unique opportunity brought new labour leadership constellation flashpoint could light politics doesnâ€™t need u agree need u"

In [None]:
input_text3 = "another argument often made conscription aside one editorial january â€œit would good young peopleâ€ becomes difficult government engage unpopular war volunteer armed force made largely young working class ethnic minority men limited job opportunity middle upper class parent quite happy see others fighting however child involved perhaps come home body bag much wider objection war certainly case americaâ€™s vietnam venture came end following major demonstration lobbying middle class people political persuasion joseph lockeretz london â€¢ editorial recognising courage refused fight failed mention lion hearted first world war lance corporal william coltmanâ€™s award vc dcm bar mm bar made british armed forcesâ€™ decorated rank distinction still hold deep religious conviction would allow kill served throughout conflict unarmed stretcher bearer personal modesty authoritiesâ€™ problem contradiction hero would take arm mean today bravest soldier people never heard geoff meade st albans hertfordshire â€¢ act refer gave state right â€œforce unmarried men die countryâ€ soon extended married men â€“ may grandfather â€œcalled upâ€ leaving grandmother care five child adrianne leman london â€¢ join debate â€“ email guardian letter theguardian com"

In [None]:
input_text4=" Java 17, released in September 2021, marks a significant milestone in the evolution of the Java programming language. This version brings forth a plethora of new features, enhancements, and performance improvements, making it a highly anticipated release in the Java community.  In this essay, we will explore the key features of Java 17 and delve into its diverse range of use cases"

In [None]:

# Generate header
header = generate_header(input_text4)
print("Generated Header:", header)

Generated Header: Java Features Released September Marks


In [None]:
header = generate_header_from_text(input_text4)
print(header)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.
Your max_length is set to 75, but your input_length is only 74. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=37)


 Java 17, released in September 2021, marks a significant milestone in the evolution of the Java programming language . This version brings forth a plethora of new features, enhancements, and performance improvements . In this essay, we will explore the key features of Java 17 and delve into its diverse range of use cases .


In [None]:
# Generate header with NER
header = generate_header_with_ner(input_text4)
print("Generated Header with NER:", header)

Generated Header with NER: Java 17 Java September 2021


In [None]:
# Generate header with keyphrases
header = generate_header_with_keyphrases(input_text4)
print("Generated Header with Keyphrases:", header)

NameError: ignored

In [None]:
header = generate_header_with_t5(input_text4)
print(header)

ImportError: ignored

In [None]:
# Example usage
header = generate_header_with_bart(input_text4)
print(header)




 Java 17, released in September 2021, marks a significant milestone in the evolution of the Java programming language. This version brings forth a plethora of new features, enhancements, and performance improvements. In this essay, we will explore the key features


In [None]:
# Example usage
header = generate_header_with_gpt2(input_text2)
print(header)


NameError: ignored

In [None]:
# Example usage
header = generate_header_with_gpt_neo(input_text4)
print(header)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Generate a concise and informative header for the following paragraph:  Java 17, released in September 2021, marks a significant milestone in the evolution of the Java programming language. This version brings forth a plethora of new features, enhancements, and performance improvements, making it a highly anticipated release in the Java community.  In this essay, we will explore the key features of Java 17 and delve into its diverse range of use cases.
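The output above is the prompt repeated almost verbatim. Two things likely contribute: a causal language model returns the prompt and its continuation as one sequence, and max_length=60 counts the prompt's tokens as well, so a prompt longer than 60 tokens leaves almost no room for new text. The helpers below are a hypothetical, library-free sketch of both points; strip_prompt and new_token_budget are names invented for this example.

```python
def strip_prompt(decoded_text, prompt):
    # Causal language models return prompt + continuation as one sequence,
    # so remove the prompt when the decoded text starts with it.
    # (Decoding can alter whitespace, in which case this prefix match fails
    # and the text is returned unchanged apart from trimming.)
    if decoded_text.startswith(prompt):
        return decoded_text[len(prompt):].strip()
    return decoded_text.strip()

def new_token_budget(prompt_token_count, max_length):
    # With max_length, prompt tokens count against the total length.
    return max(0, max_length - prompt_token_count)

print(strip_prompt("Title this: Java 17 Overview", "Title this:"))
print(new_token_budget(prompt_token_count=80, max_length=60))
```

With transformers, a more robust approach is usually to slice the generated token IDs past the prompt length before decoding, and to pass max_new_tokens rather than max_length so the budget applies only to generated text.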




---



**Testing LangChain and Ollama**

In [None]:
pip install langchain

Collecting langchain
  Downloading langchain-0.0.343-py3-none-any.whl (1.9 MB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.3-py3-none-any.whl (28 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-core<0.1,>=0.0.7 (from langchain)
  Downloading langchain_core-0.0.7-py3-none-any.whl (177 kB)
Collecting langsmith<0.1.0,>=0.0.63 (from langchain)
  Downloading langsmith-0.0.67-py3-none-any.whl (47 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain)
  Downloading mars

In [None]:
!curl https://ollama.ai/install.sh | sh

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7902    0  7902    0     0  20302      0 --:--:-- --:--:-- --:--:-- 20365
>>> Downloading ollama...
############################################################################################# 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Creating ollama user...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> The Ollama API is now available at 0.0.0.0:11434.
>>> Install complete. Run "ollama" from the command line.


In [None]:
!ollama serve | !ollama run llama2

/bin/bash: line 1: !ollama: command not found
2023/11/30 15:37:09 images.go:784: total blobs: 0
2023/11/30 15:37:09 images.go:791: total unused blobs removed: 0
2023/11/30 15:37:09 routes.go:777: Listening on 127.0.0.1:11434 (version 0.1.12)


In [None]:
!ollama run llama2

Error: could not connect to ollama server, run 'ollama serve' to start it


In [None]:
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = Ollama(model="llama2",
             callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))

llm("Tell me 5 facts about Roman history:")

ConnectionError: ignored
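The three failures above appear to share one cause: no Ollama server was running when the later cells executed. In the piped cell, the second `!` is passed to bash literally (hence "!ollama: command not found"), and `ollama serve` blocks in the foreground until the cell is stopped. A possible workaround, sketched below, is to start the server as a background process from Python and wait until its port accepts connections. The helper names are hypothetical, the port 11434 is taken from the log above, and whether this works in a given hosted notebook environment has not been verified here.

```python
import socket
import subprocess
import time

def wait_for_port(host, port, timeout=30.0, interval=0.5):
    # Poll until a TCP connection to host:port succeeds or the timeout expires.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

def start_ollama_server(host="127.0.0.1", port=11434):
    # Assumes the `ollama` binary installed by the script above is on PATH.
    process = subprocess.Popen(
        ["ollama", "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if not wait_for_port(host, port):
        process.terminate()
        raise RuntimeError("ollama server did not become reachable")
    return process

# Example (not executed here): server = start_ollama_server()
```

Once the server is reachable, the model would still need to be downloaded (for example with `!ollama pull llama2` in its own cell) before the LangChain call can return a response.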