# Model exploration

This notebook explores the two model components one at a time: the image-to-text captioning model and the GPT-2 language model.

In [5]:
DATA_PATH = '../data/'

## Image-to-text

In [1]:
from transformers import pipeline

# create image_to_text model
image_to_text = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")

Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


### Local images

In [7]:
text = image_to_text(DATA_PATH+"poem_images/0.jpg")[0]["generated_text"]
print("Description: ", text)



Description:  two pink flowers in a flower pot 


### Online images

Instead of a local path, we can also pass an image URL to the `image_to_text` pipeline.

In [8]:
text = image_to_text("https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2F3.bp.blogspot.com%2F-2pq1tT_bwRQ%2FUSMp_PHAjtI%2FAAAAAAAAThM%2FPX2H_FV13w4%2Fs1600%2FMaple%2BPoetic%2BWallpaper.jpg&f=1&nofb=1&ipt=efde060abd000da0e74c17c7171a6a56909b67445a355f7e04afe2f094360b36&ipo=images")
text = text[0]["generated_text"]
print("Description: ", text)



Description:  a painting of a flower arrangement in a flowery sky 
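The URL in the cell above is a DuckDuckGo image-proxy link that carries the original image address in its `u` query parameter. As a minimal sketch using only the standard library, the original address can be recovered as shown here. The helper name `unwrap_proxy_url` is hypothetical, and whether the pipeline can fetch the unwrapped address has not been checked.

```python
from urllib.parse import urlparse, parse_qs


def unwrap_proxy_url(url, param="u"):
    """Return the address stored in the `param` query field, or `url` itself if absent.

    Hypothetical helper for illustration; it only parses the string and
    performs no network request.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get(param)
    return values[0] if values else url


# Shortened, made-up proxy link with the same shape as the one above
proxied = "https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fexample.com%2Fa%2Bb.jpg&f=1"
print(unwrap_proxy_url(proxied))
```

Passing the unwrapped address to `image_to_text` would avoid depending on the extra tokens in the proxy link.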


### Multiple images

We can also pass a list of images. The pipeline captions each image individually and returns one result list per image, in input order.

In [6]:
texts = image_to_text([DATA_PATH+"poem_images/0.jpg", DATA_PATH+"poem_images/3.jpg"])
descriptions = [text[0]["generated_text"] for text in texts]
print("Descriptions: ", descriptions)



Descriptions:  ['two pink flowers in a flower pot ', 'a row of boats sitting on a dock ']
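The two calls above return differently nested results: a single image gives a list of dicts, while a list of images gives one such list per image. The sketch below normalises both shapes and strips the trailing space seen in the captions. The helper `first_captions` is hypothetical and the sample data is copied from the outputs above, so no model is loaded.

```python
def first_captions(result):
    """Return the first caption per image from image-to-text pipeline output.

    Accepts either the single-image shape (list of dicts) or the
    multi-image shape (list of lists of dicts). Hypothetical helper.
    """
    # Wrap the single-image shape so both cases look like a batch
    if result and isinstance(result[0], dict):
        result = [result]
    return [candidates[0]["generated_text"].strip() for candidates in result]


# Sample outputs copied from the cells above
single = [{"generated_text": "two pink flowers in a flower pot "}]
batch = [
    [{"generated_text": "two pink flowers in a flower pot "}],
    [{"generated_text": "a row of boats sitting on a dock "}],
]

print(first_captions(single))
print(first_captions(batch))
```

Stripping the captions at this point keeps the stray trailing space out of any prompt built from them later.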


## Language model - GPT-2

In [8]:
from image_to_poem.language_model.gpt2 import GPT2Model

# load the GPT-2 model from a local checkpoint directory
model_dir = "../models/language_models/model_20231123_190606/model/"
model = GPT2Model(model_dir)

# generate three poems without a prompt
poems = model.generate(num_return_sequences=3)

# print each poem under a numbered separator
for i, poem in enumerate(poems):
    print("-"*10 + f"Poem {i+1}" + "-"*10)
    print(poem)
    print()

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


----------Poem 1----------
, that I had someone, my thoughts, for each
One, I've gone alone, I'm not alone at home in this garden?
Where I'm like, I've gone to the door. I'm not alone.
That was my face I've gone through the walls
 I will tell you, my dream and my tears;
 you're a child I am alone
, I know you are alone I have a life, I have been
; I'm going

----------Poem 2----------


Where I don't think I'm alone

And I'm dying alone



And I'm lonely
 I think of

We've given him my life

I am not who's too much
And I am not alone
 I am here
 I love and you never leave me alone.
But you're afraid
—I'm not a little you alone
 I am not alone.
My little mind
The shadow of my home
 I don

----------Poem 3----------
 with his own lies.I am with life, but you are aloneAnd you know me, alone alone,
My voice to stop my self

He was alone, to know not from your walls,
The word of our choiceBut that I'm alone and myself will stop;
With no thought I feel alone.

When I am alone, with a fear th

In [9]:
model.generate(prompt="this is a cat")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[" and I am a man in my day\n\nMy own father died alone I'm alone\n and I'm not.\nWho was an angel?\nAll my own\nDon't know\n I am\nThe end all of my soul, no soul.\nThat day I was brokenMy dreamAnd I'm not a thing\n\nMy family, I need a day; I'm alone, the shadow, the dark that is her\nMy father, I cannot make me"]

In [14]:
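# collect the parameter tensors (weights and biases) of the underlying model
params = list(model.model.parameters())

# number of parameter tensors, not the number of scalar parameters
len(params)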

148

In [20]:
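# get_parameter expects the full name of an nn.Parameter; "transformer" is a submodule, so this raises
model.model.get_parameter("transformer")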

AttributeError: `transformer` is not an nn.Parameter

In [19]:
model.model

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50259, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50259, bias=False)
)
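
The module tree above can be used to cross-check the 148 parameter tensors counted earlier. The sketch below rebuilds the list of tensor shapes by hand. The `Conv1D` shapes are not shown in the printout, so the standard GPT-2 small widths (3 x 768 for the fused attention projection, 4 x 768 for the MLP) are assumed. It is also assumed that `lm_head` shares its weight with `wte`, so it adds no tensor of its own. The vocabulary size 50259 is the base GPT-2 vocabulary of 50257 plus two extra tokens, which is consistent with the warning about added special tokens.

```python
from math import prod

# Sizes read from the printed module tree
VOCAB, N_POS, D_MODEL, N_LAYER = 50259, 1024, 768, 12
# Assumed standard GPT-2 small widths (not visible in the printout)
D_ATTN = 3 * D_MODEL  # fused query/key/value projection
D_MLP = 4 * D_MODEL   # hidden width of the MLP


def gpt2_parameter_shapes():
    """List (name, shape) for each parameter tensor; lm_head is assumed tied to wte."""
    shapes = [("wte.weight", (VOCAB, D_MODEL)), ("wpe.weight", (N_POS, D_MODEL))]
    block = [
        ("ln_1.weight", (D_MODEL,)), ("ln_1.bias", (D_MODEL,)),
        ("attn.c_attn.weight", (D_MODEL, D_ATTN)), ("attn.c_attn.bias", (D_ATTN,)),
        ("attn.c_proj.weight", (D_MODEL, D_MODEL)), ("attn.c_proj.bias", (D_MODEL,)),
        ("ln_2.weight", (D_MODEL,)), ("ln_2.bias", (D_MODEL,)),
        ("mlp.c_fc.weight", (D_MODEL, D_MLP)), ("mlp.c_fc.bias", (D_MLP,)),
        ("mlp.c_proj.weight", (D_MLP, D_MODEL)), ("mlp.c_proj.bias", (D_MODEL,)),
    ]
    for i in range(N_LAYER):
        shapes += [(f"h.{i}.{name}", shape) for name, shape in block]
    shapes += [("ln_f.weight", (D_MODEL,)), ("ln_f.bias", (D_MODEL,))]
    return shapes


shapes = gpt2_parameter_shapes()
num_tensors = len(shapes)                          # what len(params) measures
num_params = sum(prod(shape) for _, shape in shapes)  # scalar parameter count
print(num_tensors, num_params)
```

Under these assumptions the tensor count matches the 148 reported by `len(params)`. On the real model, the scalar count could be compared against `sum(p.numel() for p in model.model.parameters())`.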