# AI agent that generates Instagram posts in the tone of voice of a specified page
## Task description:
Using open-source LLMs from Hugging Face, create an agent that accepts instructions such as
"create a new post about a 25-liter Adventure backpack for $200,
which is great for mountaineers" or "write me a post about the giveaway of 3 bags from our new collection" and generates an Instagram post in the style of
this page: https://www.instagram.com/ospreypacks/
The data were scraped using the Apify Instagram Scraper.


### 0. Modules installation and importing

#### Run the cell below ⬇️ to install all the modules this notebook needs

In [26]:
!pip install -q -U pandas torch transformers accelerate langchain langchain-huggingface gradio

In [1]:
import pandas as pd
from transformers import AutoTokenizer, pipeline
import torch
from langchain_huggingface import HuggingFacePipeline
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
import gradio as gr
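# Hugging Face model ID of the instruction-tuned base model
base_model = 'mistralai/Mistral-7B-Instruct-v0.2'
# Path to the scraped Instagram data (the path suggests a Google Drive mounted in Colab)
instagram_dataset = 'drive/MyDrive/datasets/instagram_data.csv'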

### 1. Captions data exploration

In [2]:
df = pd.read_csv(instagram_dataset, low_memory=False)

In [3]:
# Keep only the caption column as a separate one-column DataFrame
df_captions = df[['caption']].copy()
df_captions.head()

|   | caption |
|---|---------|
| 0 | Cheers to 50 years - to celebrate, we’re highl... |
| 1 | Want to become an Osprey Ambassador? \n\nWhile... |
| 2 | The light at the end of April's showers 🌼🌷 Whe... |
| 3 | A half-century later and we’re just as passion... |
| 4 | From ocean-bound PET bottles, to sustainable* ... |


In [4]:
df_captions.isna().sum()

caption    13
dtype: int64
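
The count above shows 13 rows without a caption. A minimal sketch of filtering such rows out before picking few-shot examples, shown on a tiny stand-in frame (the `toy` data is hypothetical; in the notebook the same call would presumably apply to `df_captions`):

```python
import pandas as pd

# Hypothetical stand-in for df_captions: two placeholder captions and one missing value.
toy = pd.DataFrame({'caption': ['First caption #OspreyPacks', None, 'Second caption']})

# dropna keeps the original index labels, so label-based lookups such as
# toy['caption'][2] still point at the same rows afterwards.
clean = toy.dropna(subset=['caption'])

print(clean.isna().sum())          # the 'caption' count is now 0
print(len(toy), '->', len(clean))  # one row was removed
```

Because the index labels are preserved, lookups by a fixed label keep returning the same caption, as long as that label did not belong to a dropped row.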

### 2. Instantiating transformers pre-trained objects

In [5]:
tokenizer = AutoTokenizer.from_pretrained(base_model)

tokenizer_config.json:   0%|          | 0.00/1.46k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.80M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

In [6]:
my_pipeline = pipeline('text-generation',
                       model=base_model,
                       tokenizer=tokenizer,
                       torch_dtype=torch.bfloat16,
                       device_map='auto',
                       use_cache=True,
                       max_new_tokens=150
                       )

config.json:   0%|          | 0.00/596 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/25.1k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/3 [00:00<?, ?it/s]

model-00001-of-00003.safetensors:   0%|          | 0.00/4.94G [00:00<?, ?B/s]

model-00002-of-00003.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00003-of-00003.safetensors:   0%|          | 0.00/4.54G [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/111 [00:00<?, ?B/s]
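
As a rough sanity check on whether the model fits on the available accelerator, the three shard sizes in the download log above can be added up. This is back-of-the-envelope arithmetic only, assuming the shards hold nothing but bfloat16 weights (2 bytes per parameter) and ignoring activations and the KV cache:

```python
# Shard sizes in GB, as reported by the download progress bars above.
shard_sizes_gb = [4.94, 5.00, 4.54]

total_gb = sum(shard_sizes_gb)

# bfloat16 stores each parameter in 2 bytes.
BYTES_PER_PARAM = 2
approx_params_billions = total_gb / BYTES_PER_PARAM

print(f'weights: ~{total_gb:.2f} GB')
print(f'implied size: ~{approx_params_billions:.2f}B parameters')
```

With `device_map='auto'`, the layers are distributed over whatever devices are available, so a GPU with less memory than this estimate may still load the model, most likely at the cost of slower generation.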



In [7]:
my_llm = HuggingFacePipeline(pipeline=my_pipeline)

### 3. Exploring a sample of existing captions to feed to the model as examples

In [8]:
df_captions['caption'][500]

'It can be hard to put your finger on what exactly gives you that Mountainfilm feeling. But something about these old festival intros comes very close.\n\nPasses to Mountainfilm 2022 are on sale now! Whether you are able to join us in-person or virtually, both festivals certainly promise to deliver that indescribable soul fire. Get more info at mountainfilm.org - linked in our bio.\n\n📷 @ben_eng_photo\n\n#OspreyPacks #mountainfilm\n#mountainfilm2022 #mountainfilmintelluride #mountainfilmonline'

In [9]:
df_captions['caption'][1050]

'“Unbridled joy of accomplishment”. 📷 by: @digby_coffee  Featured pack from the Jet Series #ospreypacks #thegooddaysaremade'

In [10]:
df_captions['caption'][333]

'Stories to inspire your new year 🌞\n\nWhat does it take to achieve 50 consecutive months of skiing? Skier Amber Chang (@amberkchang) tells us how she chases “turns all year”—from her home in the PNW to the peaks of Chile.\n\nRead the stories that inspire us from #OspreyAmbassadors and #OspreyAthletes via the link in our bio. | #OspreyPacks'

In [11]:
df_captions['caption'][1]

'Want to become an Osprey Ambassador? \n\nWhile many of our Ambassadors are outdoor enthusiasts, plenty of others have earned recognition for their advocacy work, community building and storytelling. All share a passion for the outdoors. \n\nThe Osprey Ambassador application is now open for submissions. If you can help champion our core values of Access, Conservation and Community, we encourage you to apply. \n\nLearn more and apply via the link in our bio. \n\n#OspreyPacks #OspreyAmbassador'

In [12]:
df_captions['caption'][310]

'Not sure what ski pack to pick? Let #OspreyAmbassador @griff_da_pinto walk you through his favorite features of the Sopris Pro 30 and Kresta 20. | 🎥  @griff_da_pinto | Featured packs from the Soelden/Sopris Pro & Kamber/Kresta Series | #OspreyPacks'
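
The sampled captions share a few recurring traits, most visibly the brand hashtags. A small sketch of counting hashtags across captions to see which ones define the page's style. It uses only excerpts copied from the cells above; `count_hashtags` is a hypothetical helper, not part of the notebook:

```python
import re
from collections import Counter

# Three caption excerpts copied from the cells above.
captions = [
    '“Unbridled joy of accomplishment”. 📷 by: @digby_coffee  Featured pack from the Jet Series #ospreypacks #thegooddaysaremade',
    'Not sure what ski pack to pick? Let #OspreyAmbassador @griff_da_pinto walk you through his favorite features of the Sopris Pro 30 and Kresta 20. | 🎥  @griff_da_pinto | Featured packs from the Soelden/Sopris Pro & Kamber/Kresta Series | #OspreyPacks',
    'Read the stories that inspire us from #OspreyAmbassadors and #OspreyAthletes via the link in our bio. | #OspreyPacks',
]


def count_hashtags(texts):
    """Count hashtags across captions, ignoring letter case."""
    counts = Counter()
    for text in texts:
        counts.update(tag.lower() for tag in re.findall(r'#\w+', text))
    return counts


hashtag_counts = count_hashtags(captions)
print(hashtag_counts.most_common(3))
```

Run over the full caption column (after skipping missing values), the same count would show which hashtags the few-shot examples should contain.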

#### Description of the model's behaviour

In [13]:
context = '''
You are an AI tasked with creating engaging Instagram post captions for Osprey Packs.
Your audience loves comfortable travel and outdoor adventures.
Highlight the product features creatively to appeal to this audience.
Use many emojis and hashtags in your caption.
Below are examples of input queries and their corresponding answers you have generated:
'''

#### Examples of queries and answers

In [14]:
examples = [
    {
        'query': "Generate an invitation post about Mountainfilm 2022 festival directing to the website link.",
        'answer': '''
It can be hard to put your finger on what exactly gives
you that Mountainfilm feeling. But something about these old festival intros comes very close.

Passes to Mountainfilm 2022 are on sale now! Whether you are able to join us
in-person or virtually, both festivals certainly promise to deliver that indescribable soul fire.
Get more info at mountainfilm.org - linked in our bio.

📷 @ben_eng_photo

#OspreyPacks #mountainfilm
#mountainfilm2022 #mountainfilmintelluride #mountainfilmonline ///
'''
    },
    {
        'query': 'Create a caption about an interview with a skier named Amber Chang.',
        'answer': '''
Stories to inspire your new year 🌞

What does it take to achieve 50 consecutive months of skiing? Skier Amber Chang (@amberkchang) tells us how she chases “turns all year”—from her home in the PNW to the peaks of Chile.

Read the stories that inspire us from #OspreyAmbassadors and #OspreyAthletes via the link in our bio. | #OspreyPacks ///
'''
    },
    {
        'query': "Write a caption about the pack from 'Jet' Series with some quotation.",
        'answer': '''
“Unbridled joy of accomplishment”. 📷 by: @digby_coffee
Featured pack from the Jet Series #ospreypacks #thegooddaysaremade ///
'''
    },
    {
        'query': "Write a caption promoting the inspiring stories from skier Amber Chang.",
        'answer': '''
Stories to inspire your new year 🌞

What does it take to achieve 50 consecutive months of skiing? Skier Amber Chang (@amberkchang)
tells us how she chases “turns all year”—from her home in the PNW to the peaks of Chile.

Read the stories that inspire us from #OspreyAmbassadors and #OspreyAthletes via the link in our bio. | #OspreyPacks ///
'''
    },
    {
        'query': "Generate a caption about some advice on which backpack model from our recent series to choose to go skiing.",
        'answer': '''
Not sure what ski pack to pick? Let #OspreyAmbassador @griff_da_pinto walk
you through his favorite features of the Sopris Pro 30 and Kresta 20. | 🎥  @griff_da_pinto
| Featured packs from the Soelden/Sopris Pro & Kamber/Kresta Series | #OspreyPacks ///
'''
    }
]

In [15]:
example_prompt = PromptTemplate(input_variables=['query', 'answer'],
                                template='''
Query: {query}
Answer: {answer}
'''
)

#### Few-Shot Prompt Template creation

In [16]:
my_few_shot_prompt = FewShotPromptTemplate(examples=examples,
                                           example_prompt=example_prompt,
                                           prefix=context,
                                           suffix='Query: {query}\nAnswer:',
                                           input_variables=['query'],
                                           example_separator='\n')
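
To make the shape of the final prompt easier to picture, here is a plain-Python approximation of how the prefix, the formatted examples and the suffix are joined. It is a sketch of the idea, not LangChain's source code; `assemble_few_shot_prompt` and the shortened prefix and example are hypothetical:

```python
def assemble_few_shot_prompt(prefix, examples, example_template, suffix, separator, **inputs):
    """Join the prefix, the formatted examples and the filled-in suffix."""
    example_blocks = [example_template.format(**ex) for ex in examples]
    pieces = [prefix, *example_blocks, suffix.format(**inputs)]
    # Empty pieces are skipped so they do not produce doubled separators.
    return separator.join(piece for piece in pieces if piece)


# Shortened stand-ins for the notebook's context and examples.
prefix = 'You write Instagram captions for Osprey Packs.'
examples = [
    {'query': 'Write a caption about the Jet Series.',
     'answer': 'Featured pack from the Jet Series #ospreypacks ///'},
]
example_template = '\nQuery: {query}\nAnswer: {answer}\n'
suffix = 'Query: {query}\nAnswer:'

prompt = assemble_few_shot_prompt(prefix, examples, example_template, suffix,
                                  separator='\n',
                                  query='Create a caption about hiking.')
print(prompt)
```

Note that the prompt ends with `Answer:` and no trailing space, which matters for any code that later searches the model output for that marker. In the notebook itself, `my_few_shot_prompt.format(query=...)` should return the real assembled prompt for inspection.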

In [17]:
my_llm_chain = my_few_shot_prompt | my_llm

#### Neat response generator function

In [18]:
def generate_response(question: str, end_token: str = '///') -> str:
  '''
  Generates a response for the given question and cuts it off at the end token.
  Args:
    question: the prompt to generate a response for.
    end_token: the string that marks the end of the generated caption.
  Returns:
    The generated caption with the prompt and the end token removed.
  '''
  response = my_llm_chain.invoke(question)
  # The prompt suffix ends with 'Answer:' (no trailing space), so search for that.
  marker = 'Answer:'
  marker_pos = response.rfind(marker)
  # If the prompt is not echoed back, the marker is absent: start from the beginning.
  answer_start = marker_pos + len(marker) if marker_pos != -1 else 0
  end_pos = response.find(end_token, answer_start)
  if end_pos != -1:
    final_response = response[answer_start:end_pos].strip()
  else:
    final_response = response[answer_start:].strip()
  return final_response
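
The trimming step can be exercised without loading the model by isolating it as a pure function. The sketch below is one way to do that; `extract_caption` and the sample strings are hypothetical. It searches for `Answer:` without a trailing space, since that is how the prompt suffix ends, and falls back to the start of the text when the marker is missing:

```python
def extract_caption(raw: str, marker: str = 'Answer:', end_token: str = '///') -> str:
    """Return the text between the last marker and the next end token."""
    marker_pos = raw.rfind(marker)
    # No marker means the prompt was not echoed back: keep the text from the start.
    start = marker_pos + len(marker) if marker_pos != -1 else 0
    end = raw.find(end_token, start)
    if end == -1:
        # The model stopped before emitting the end token: keep everything.
        end = len(raw)
    return raw[start:end].strip()


# Case 1: the output echoes the prompt, including an earlier example.
echoed = ('Query: old\nAnswer: old caption ///\n'
          'Query: new\nAnswer: New caption #OspreyPacks ///')
# Case 2: the output contains only the newly generated text.
generated_only = ' New caption #OspreyPacks /// extra text'
# Case 3: the end token is missing.
unfinished = 'Answer: Unfinished caption'

print(extract_caption(echoed))
print(extract_caption(generated_only))
print(extract_caption(unfinished))
```

One limitation of searching for the last marker: if the model keeps going after the end token and invents a further `Query:`/`Answer:` pair, that invented answer is the one returned.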

#### Hands-on testing

In [19]:
question_1 = '''
Generate a caption about our new backpack model called "Adventure" which has a capacity of 25 liters and costs only 200 $,
being a perfect match for passionate travellers.
'''
print(generate_response(question_1))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


🌄🏕️🌄
New to the #OspreyPacks family, the Adventure 25L is the perfect companion for your next adventure.

With a 25L capacity and a price tag of just $200, this pack is a steal for the passionate traveler.

🌄🏕️🌄
#OspreyAdventure #OspreyPacks #Travel #AdventureTime


In [25]:
question_2 = '''
Create a caption about hiking with some quote.
'''
print(generate_response(question_2))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


"A view that takes your breath away." 🌄🏔️

Join us on our next adventure with the #OspreyAtmos AG 65. | #OspreyPacks #hiking #adventure


In [27]:
question_3 = '''
Generate a catchy caption about a new backpack model for kids called My First Mountain, in pink or green colours, using pink and green emojis there.
'''
print(generate_response(question_3))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


🌿🌺🌳🌼
Introducing the My First Mountain series for the little adventurers! 🧸👶

Pink or green? Which color speaks to your little one's wild spirit? 💚🌿 or 🌷🌸?

The My First Mountain backpack is the perfect companion for their first hikes and outdoor adventures. | #OspreyPacks


In [26]:
print(generate_response('Generate a catchy caption introducing a new skiing collection called Snowy Carpathians'))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


🏔️🏔️🏔️
Introducing the Snowy Carpathians Collection! 🌨️

From the rugged peaks of the Carpathian Mountains to the slopes of your favorite resort,
our new Snowy Carpathians Collection is designed to keep you warm, dry, and ready for adventure.

🌟 Shop now and get 15% off your first order with code CARPATHIANS15. | #OspreyPacks #SnowyCarpathians


### 4. Implementation of a simple UI

In [None]:
ui = gr.Interface(fn=generate_response,
                  inputs='text',
                  outputs='text',
                  title='Osprey Packs Instagram Captions Generator',
                  description='Please enter a topic you want to generate a caption on')
ui.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://8f40742aa2f6c47bd2.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
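
Since the UI passes whatever the user types straight to the generator, a thin wrapper can keep blank submissions from reaching the model. A minimal sketch, where `make_ui_handler`, `fake_generate` and the hint message are hypothetical; in the notebook the wrapped function would be `generate_response`:

```python
from typing import Callable


def make_ui_handler(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a caption generator so blank input never reaches the model."""
    def handler(topic: str) -> str:
        topic = (topic or '').strip()
        if not topic:
            return 'Please enter a topic first.'
        return generate(topic)
    return handler


# Stand-in generator so the sketch runs without the model.
def fake_generate(topic: str) -> str:
    return f'Caption about: {topic}'


handler = make_ui_handler(fake_generate)
print(handler('   '))
print(handler('a 25-liter Adventure backpack'))
```

The wrapper also exposes a single-argument function, so the `end_token` keyword of the generator stays out of the UI. It could be wired in as `gr.Interface(fn=make_ui_handler(generate_response), ...)`.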


