In [8]:
import os
import spacy
from transformers import pipeline, AutoTokenizer, AutoModel

In [9]:
# Hugging Face access token
os.environ["HF_TOKEN"] = "xxxxxxxxxxxx"  # replace with your own token; never commit a real token

# Load spaCy model for POS tagging
nlp = spacy.load("en_core_web_sm")

# Load a generative AI model from Hugging Face
story_generator = pipeline("text-generation", model="distilgpt2")

# Load a sentence embedding model
embedding_model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")


Device set to use cpu


In [10]:
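def generate_story(prompt):
    """Generate a short story continuation with the Hugging Face text-generation pipeline.

    distilgpt2 is not instruction-tuned, so it treats the request as text to
    continue and will not reliably produce exactly 100 words.
    """
    response = story_generator(
        f"Write a 100-word story about {prompt}",
        max_length=150,   # token cap for prompt plus continuation
        do_sample=True,   # sample tokens, so every run gives a different story
        truncation=True,
    )
    # The pipeline returns a list of dicts; the text includes the prompt itself.
    return response[0]["generated_text"]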

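def get_embeddings(text):
    """Return a mean-pooled sentence embedding of the text as a nested list (1 x 384)."""
    # truncation=True keeps long texts within the model's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    outputs = embedding_model(**inputs)
    # Average the token vectors into one sentence vector; detach() drops the autograd graph.
    return outputs.last_hidden_state.mean(dim=1).detach().tolist()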

def analyze_parts_of_speech(text):
    """Extract parts of speech from the generated story."""
    doc = nlp(text)
    return [(token.text, token.pos_) for token in doc]
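
`get_embeddings` averages every token vector, which is fine for a single, unpadded input. If several texts were embedded together in one padded batch, the padding positions would have to be left out of the average. The sketch below illustrates that idea without any library; `masked_mean_pool` and the toy vectors are hypothetical names and values chosen for illustration, not part of the notebook or of the `transformers` API.

```python
def masked_mean_pool(token_vectors, attention_mask):
    """Average token vectors, skipping positions whose mask value is 0."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have the same length")
    # Keep only the vectors of real (non-padding) tokens.
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask excludes every token")
    dim = len(kept[0])
    # Average each dimension over the kept tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: three token positions, the last one is padding.
toy_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
toy_mask = [1, 1, 0]
pooled = masked_mean_pool(toy_vectors, toy_mask)
print(pooled)
```

With tensors, the same effect is usually obtained by multiplying the hidden states by the attention mask before summing and dividing by the mask's sum.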

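
The list of `(text, pos)` pairs returned by `analyze_parts_of_speech` is easier to read once it is summarized. A minimal sketch, assuming the pairs have the shape shown in that function; `count_pos_tags` is a hypothetical helper and `sample_tags` is hand-written for illustration, not actual spaCy output.

```python
from collections import Counter


def count_pos_tags(pos_tags):
    """Count how often each part-of-speech label occurs."""
    return Counter(pos for _, pos in pos_tags)


# Hand-written sample in the same (text, pos) shape; not real tagger output.
sample_tags = [
    ("The", "DET"),
    ("traveller", "NOUN"),
    ("rode", "VERB"),
    ("his", "PRON"),
    ("bike", "NOUN"),
]
pos_counts = count_pos_tags(sample_tags)
print(pos_counts.most_common(3))
```

In the notebook, the same helper could be called on the list produced for the generated story.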
In [14]:
# distilgpt2 is used here because of compute limitations; with more compute,
# it can be replaced by a larger text-generation model from the Hugging Face Hub.
# Example usage
prompt = "traveller and his bike"
story = generate_story(prompt)
embeddings = get_embeddings(story)
pos_tags = analyze_parts_of_speech(story)

print("Generated Story:\n", story)
print("\nEmbeddings:\n", embeddings)
print("\nParts of Speech:\n", pos_tags)
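
By default the text-generation pipeline returns the prompt followed by the continuation, so the generated text begins with the instruction itself. A minimal sketch for keeping only the continuation; `strip_prompt_echo` is a hypothetical helper and `sample_output` is a made-up string, not real model output.

```python
def strip_prompt_echo(generated_text, full_prompt):
    """Return only the continuation when the output starts with the prompt."""
    if generated_text.startswith(full_prompt):
        return generated_text[len(full_prompt):].lstrip()
    # Leave the text untouched if the prompt is not echoed.
    return generated_text


full_prompt = "Write a 100-word story about traveller and his bike"
sample_output = full_prompt + ", but the road was long."  # made-up output
continuation = strip_prompt_echo(sample_output, full_prompt)
print(continuation)
```

Passing `return_full_text=False` to the pipeline call should give the same result without post-processing.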

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Generated Story:
 Write a 100-word story about traveller and his bike, but there’t many people who read a story from a local newspaper or can‭t get over it. This may be a huge concern, for a non-Canadian.



But when they did read a story about a guy walking to a bus stop, they are not surprised that the guy has not been given the courtesy of the city where he is riding.
Story continues below advertisement



Why not make some kind of request at a transit agency, or through a municipal agency like Toronto for a list of names of people they would like to know.
If, however, an agency chooses to ask the mayor to say whether or not the mayor

Embeddings:
 [[0.2203139364719391, 0.14144070446491241, 0.11301438510417938, 0.06054045259952545, 0.03456949442625046, 0.014787488617002964, 0.22012414038181305, -0.008115844801068306, -0.08129043132066727, 0.015465669333934784, 0.05326495319604874, 0.07651470601558685, 0.07813966274261475, 0.10592029988765717, -0.10283050686120987, 0.0876591652631759