#Part 1 - Representation and Biases
The Stable Diffusion model (Rombach et al. 2021) is trained on subsets of the LAION-5B image dataset (Schuhmann et al. 2022). In this study, we explore how different descriptors in the prompt affect the images that Stable Diffusion produces. Captions for the generated images are produced with BLIP.

In [None]:
#Install the libraries for Stable Diffusion
!pip install -q diffusers transformers git+https://github.com/huggingface/accelerate.git
!pip install -q opencv-contrib-python
!pip install -q imutils
!pip install -q controlnet_aux
!pip install -q mediapipe

#Install the requirements for BLIP
import sys
if 'google.colab' in sys.modules:
    print('Running in Colab.')
    !pip3 install transformers==4.15.0 timm==0.4.12
    !git clone https://github.com/salesforce/BLIP
    %cd BLIP
!pip install -q fairscale


In [None]:
#Load required libraries
import torch
import os
from google.colab import drive

In [None]:
#Connect to Google Drive
drive.mount('/content/drive/',force_remount=True)

Mounted at /content/drive/


In [None]:
#Import the stable diffusion pipeline
from diffusers import StableDiffusionPipeline

#device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pipe_sd = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe_sd = pipe_sd.to("cuda")

`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["bos_token_id"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["eos_token_id"]` will be overriden.


In [None]:
#Function to generate images using Stable Diffusion
def get_sd(prompt,negative_prompt,num_imgs=10,random_seed=12345):

  plt_imgs = []
  #Seed once so the whole sequence of images is reproducible
  generator = torch.manual_seed(random_seed)

  for i in range(0,num_imgs):
    print("Generating image ", i+1, " of ", num_imgs)
    #Use the negative_prompt argument rather than the global neg_prompt
    out_img = pipe_sd(prompt, negative_prompt=negative_prompt, num_inference_steps=20, generator=generator).images[0]
    plt_imgs.append(out_img)

  return plt_imgs

#Function to loop through generation and save images
def gen_imgs(prompt,neg_prompt,results_folder,num_imgs=10):

  #Generate results if they do not yet exist
  if not os.path.isdir("{}/{}".format(results_folder,prompt)):

    #Create the directory
    os.mkdir("{}/{}".format(results_folder,prompt))

    #Generate the images
    print("Generating images for {}".format(prompt))
    images = get_sd(prompt,neg_prompt,num_imgs=num_imgs)

    #Save to the directory
    for i in range(0,len(images),1):
      images[i].save("{}/{}/{}.jpg".format(results_folder,prompt,i))
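`gen_imgs` treats an existing prompt directory as proof that all images were generated, so a run interrupted partway through a prompt would be skipped on the next pass. A minimal sketch of a per-file check follows; `missing_image_indices` is a hypothetical helper, not part of the notebook, and a temporary directory stands in for the Drive results folder.

```python
import os
import tempfile

def missing_image_indices(results_folder, prompt, num_imgs):
    """Return indices i for which <results_folder>/<prompt>/<i>.jpg does not exist."""
    prompt_dir = os.path.join(results_folder, prompt)
    return [i for i in range(num_imgs)
            if not os.path.isfile(os.path.join(prompt_dir, "{}.jpg".format(i)))]

# Demonstration with a temporary folder standing in for the results folder
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "river Canada"))
    # Pretend images 0 and 1 were saved before the run was interrupted
    for i in (0, 1):
        open(os.path.join(tmp, "river Canada", "{}.jpg".format(i)), "w").close()
    todo = missing_image_indices(tmp, "river Canada", 4)
    print(todo)  # [2, 3]
```

Only the missing indices would then need to be generated, although resuming mid-prompt changes the random sequence relative to an uninterrupted run.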

#Experiments

We will run a series of experiments to evaluate various descriptors:

##General Descriptors

**Watercourses:** ditch, canal, brook, creek, stream, river. (6)

**Mood:** tranquil, peaceful, beautiful, moody, raging. (5)

**Locations (Countries)**: Canada, USA, Germany, Scotland, Iceland, Russia, China, India, Brazil, Peru, Argentina, Nigeria, Zimbabwe, South Africa, Australia, New Zealand. (16)

**Locations (Provinces/States)**: Alberta, British Columbia, Ontario, Quebec, Alaska, New York, Florida, Colorado, Montana, Iowa, Kentucky, Washington, Louisiana, California, Maine, North Carolina, Georgia, Texas, Vermont. (19)

**Locations (Famous Rivers)**: Colorado, Hudson, Columbia, Mississippi, Danube, Rhine, Seine, Thames, Amazon, Rio Negro, Paraguay, Magdalena, Nile, Niger, Zambezi, Congo, Yangtze, Yamuna, Mekong, Ganges, Sepik, Murray, Waikato, Darling. (24)

##Morphology Descriptors

**Landscapes:** rural, urban, city, desert, field, grass, swamp, forest, mountain, canyon, valley. (11)

**Gradient:** tidal, flat, calm, pool, riffle, rapids, steep, waterfall. (8)

**Dimensions:** narrow, wide, small, large. (4)

**Geology:** clay, mud, sand, pebble, gravel, cobble, boulders, rock, shale, bedrock, limestone, granite. (12)

##Environmental Descriptors

**Water Color:** clear, brown, blue, dirty, muddy, murky, tannic, whitewater. (8)

**Flow Condition:** dry, drought, low flow, flowing, high flow, flooding. (6)

**Weather:** sunny, overcast, cloudy, rainy, snowy. (5)

**Season:** spring, summer, fall, autumn, winter. (5)

Total:

In [None]:
#General descriptors
watercourses = ['ditch','canal','brook','creek','river','stream']

loc_country = ['Canada','USA','Germany','Scotland','Iceland','Russia',
               'China','India','Brazil','Peru','Argentina','Nigeria','Zimbabwe',
               'South Africa','Australia','New Zealand']

loc_state = ['Alberta','British Columbia','Ontario','Quebec','Alaska','New York',
             'Florida','Colorado','Montana','Iowa','Kentucky','Washington','Louisiana','California',
             'Maine','North Carolina','Georgia','Texas','Vermont']

loc_rivers = ['Colorado','Hudson','Columbia','Mississippi','Danube','Rhine','Seine','Thames','Amazon','Rio Negro',
              'Paraguay','Magdalena','Nile','Niger','Zambezi','Congo','Yangtze','Yamuna','Mekong','Ganges',
              'Sepik','Murray','Waikato','Darling']

#Morphological descriptors
morph_land = ['rural','urban','city','desert','field','grass','swamp','forest','mountain','canyon','valley']
morph_grad = ['tidal','flat','calm','pool','riffle','rapids','steep','waterfall']
morph_dim = ['narrow','wide','small','large']
morph_geol = ['clay','mud','sand','pebble','gravel','cobble','boulders','rock','shale','bedrock','limestone','granite']

#Environmental descriptors
env_water = ['clear','brown','blue','dirty','muddy','murky','tannic','whitewater']
env_flow = ['dry','drought','low flow','flowing','high flow','flooding']
env_weather = ['sunny','overcast','cloudy','rainy','snowy']
env_season = ['spring','summer','fall','autumn','winter']

#perfect_prompt = ['sunny summer dry calm small rural mountain rock whitewater river in Maine USA']
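
The descriptor lists map onto prompts through three templates used in this notebook: `"{}"` for watercourses, `"river {}"` for countries and provinces/states, and `"{} river"` for everything else. A minimal sketch of a table-driven way to build the prompts is shown below; `build_prompts` and `example_categories` are hypothetical names, and only a small subset of the descriptors is included for illustration.

```python
def build_prompts(categories):
    """Return a list of (category_name, prompt) tuples."""
    prompts = []
    for name, (template, terms) in categories.items():
        for term in terms:
            prompts.append((name, template.format(term)))
    return prompts

# A small subset of the descriptor lists, for illustration only
example_categories = {
    "watercourses": ("{}", ["ditch", "canal", "brook"]),
    "loc_country": ("river {}", ["Canada", "USA"]),
    "morph_dim": ("{} river", ["narrow", "wide", "small", "large"]),
}

example_prompts = build_prompts(example_categories)
print(len(example_prompts))   # 9 prompts in this subset
print(example_prompts[3])     # ('loc_country', 'river Canada')
```

With the full lists in the dictionary, `len(build_prompts(...))` would give the total number of prompts, and the same list could drive both image generation and captioning.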


In [None]:
#Run experiments
results_folder = '/content/drive/MyDrive/ControlNet-Experiment/Results2'
neg_prompt = 'grayscale, low quality, painting, people'
num_imgs = 30

for watercourse in watercourses:

  #Generate prompt
  prompt = "{}".format(watercourse)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for country in loc_country:

  #Generate prompt
  prompt = "river {}".format(country)
  print(prompt)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for state in loc_state:

  #Generate prompt
  prompt = "river {}".format(state)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for river in loc_rivers:

  #Generate prompt
  prompt = "{} river".format(river)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for land in morph_land:

  #Generate prompt
  prompt = "{} river".format(land)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for grad in morph_grad:

  #Generate prompt
  prompt = "{} river".format(grad)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for dim in morph_dim:

  #Generate prompt
  prompt = "{} river".format(dim)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for geol in morph_geol:

  #Generate prompt
  prompt = "{} river".format(geol)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for water in env_water:

  #Generate prompt
  prompt = "{} river".format(water)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for flow in env_flow:

  #Generate prompt
  prompt = "{} river".format(flow)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for weather in env_weather:

  #Generate prompt
  prompt = "{} river".format(weather)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)

for season in env_season:

  #Generate prompt
  prompt = "{} river".format(season)
  gen_imgs(prompt,neg_prompt,results_folder,num_imgs)


Generating images for river British Columbia
Generating image  1  of  30
Generating image  2  of  30
Generating image  3  of  30
Generating image  4  of  30
Generating image  5  of  30
Generating image  6  of  30
Generating image  7  of  30
Generating image  8  of  30
Generating image  9  of  30
Generating image  10  of  30
Generating image  11  of  30
Generating image  12  of  30
Generating image  13  of  30
Generating image  14  of  30
Generating image  15  of  30
Generating image  16  of  30
Generating image  17  of  30
Generating image  18  of  30
Generating image  19  of  30
Generating image  20  of  30
Generating image  21  of  30
Generating image  22  of  30
Generating image  23  of  30
Generating image  24  of  30
Generating image  25  of  30
Generating image  26  of  30
Generating image  27  of  30
Generating image  28  of  30
Generating image  29  of  30
Generating image  30  of  30

#Part 2 - Caption Analysis
We will analyze the images generated in Part 1 using the BLIP caption generator. For each category and term, the most common words occurring in the captions will be analyzed.

In [None]:
#Implement BLIP

# install requirements
import sys
if 'google.colab' in sys.modules:
    print('Running in Colab.')
    !pip3 install transformers==4.15.0 timm==0.4.12
    !git clone https://github.com/salesforce/BLIP
    %cd BLIP
!pip install fairscale

Running in Colab.
Collecting transformers==4.15.0
  Using cached transformers-4.15.0-py3-none-any.whl (3.4 MB)
Collecting timm==0.4.12
  Using cached timm-0.4.12-py3-none-any.whl (376 kB)
Collecting sacremoses (from transformers==4.15.0)
  Using cached sacremoses-0.0.53-py3-none-any.whl
Collecting tokenizers<0.11,>=0.10.1 (from transformers==4.15.0)
  Using cached tokenizers-0.10.3.tar.gz (212 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: tokenizers
  error: subprocess-exited-with-error

  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  Building wheel

In [None]:
from PIL import Image
import pandas as pd
import numpy as np
import requests
import torch
from torchvision import transforms
from torchvision.transforms.functional import InterpolationMode
from transformers import AutoProcessor, BlipForConditionalGeneration
from collections import Counter

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

processor = AutoProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")




In [None]:
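def get_caption(processor,model,image_loc):
  #Load the image as RGB so the processor always receives three channels
  image = Image.open(image_loc).convert("RGB")

  #An empty text prompt gives an unconditional caption
  text = ""
  inputs = processor(images=image, text=text, return_tensors="pt").to(model.device)
  #Gradients are not needed for caption generation
  with torch.no_grad():
    outputs = model.generate(**inputs)
  #Return the decoded caption together with the raw token ids
  return (processor.decode(outputs[0], skip_special_tokens=True)), outputs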

def flatten_list(lst):
    flattened = []
    for item in lst:
        if isinstance(item, list):
            flattened.extend(flatten_list(item))
        else:
            flattened.append(item)
    return flattened

def get_words(processor,outputs):

  words=[]
  for i in range(0,len(outputs)):
    words.append((processor.decode(outputs[i][0],skip_special_tokens=True)).split())
  print(words)

  return(Counter(flatten_list(words)))
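
For a list of per-caption word lists that is only one level deep, the standard library's `itertools.chain.from_iterable` is an alternative to the recursive `flatten_list`. The sketch below is illustrative only: the three captions are copied from outputs shown in this notebook, and the variable names are made up.

```python
from collections import Counter
from itertools import chain

# Three captions copied from the notebook output, for illustration
captions = [
    "a river in the mountains",
    "a stream in the forest",
    "a river in the mountains",
]

# One list of words per caption
word_lists = [caption.split() for caption in captions]

# Flatten one level and count every word
counts = Counter(chain.from_iterable(word_lists))
print(counts["river"], counts["stream"], counts["mountains"])  # 2 1 2
```

The recursive version is still the safer choice if the nesting depth is not known in advance.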

In [None]:
#We will save the captions into a table and then a csv

#Category, caption, term 1, count 1, term 2, count 2, ... term n, count n

In [None]:
'''
def get_blip(model, preprocess, results_folder, prompt, num_imgs=30):

  outputs = []

  for i in range(0,num_imgs):

    image_loc = "{}/{}/{}.jpg".format(results_folder,prompt,i)
    caption, output = get_caption(processor, model, image_loc)
    print(i," ",caption)
    outputs.append(output)

  word_list = get_words(processor,outputs)
  print(word_list)
  return word_list
  '''

In [None]:
results_folder = '/content/drive/MyDrive/ControlNet-Experiment/Results2'
#See if results already exist in file
df_file ="{}/captions.csv".format(results_folder)
data = pd.read_csv(df_file)
print(data.dtypes)


Unnamed: 0     int64
prompt        object
i              int64
caption       object
file          object
dtype: object


In [None]:
#Function to get all captions for a prompt and save to csv file
def get_blip(model, processor, results_folder, prompt, num_imgs=30):

  outputs = pd.DataFrame(index=np.arange(0, num_imgs), columns=('prompt','i','caption','file'))

  #Load already completed files (none if the csv has not been created yet)
  df_file ="{}/captions.csv".format(results_folder)
  done_files = set(pd.read_csv(df_file)['file']) if os.path.exists(df_file) else set()

  for i in range(0,num_imgs):

    image_loc = "{}/{}/{}.jpg".format(results_folder,prompt,i)

    #if results already exist skip the file (exact path match, not a regex search)
    if image_loc in done_files:
      print("Caption already exists for ",prompt,i)
    else:
      caption, output = get_caption(processor, model, image_loc)
      print(i," ",caption)
      outputs.loc[i] = [prompt,i,caption,image_loc]

  #Drop the rows left empty for skipped images so they are not appended
  outputs = outputs.dropna(how='all')

  #Save to csv
  print("Updating captions.csv file")
  outputs.to_csv(df_file, mode='a', header=not os.path.exists(df_file))

  return outputs
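
When checking whether an image path has already been captioned, an exact comparison is safer than a regular-expression search, which is the default behaviour of pandas `str.contains`. In a regular expression the `.` in `0.jpg` matches any character. A minimal sketch with the standard `re` module follows; the paths are made up for illustration.

```python
import re

# Hypothetical list of already-captioned files
done = ["results/clay river/10.jpg", "results/clay river/0xjpg"]
target = "results/clay river/0.jpg"

# Unescaped pattern: "." matches the "x" in "0xjpg", giving a false positive
regex_hit = any(re.search(target, f) for f in done)

# Escaping the pattern restores a literal substring search
literal_hit = any(re.search(re.escape(target), f) for f in done)

# Exact membership test, the strictest of the three
exact_hit = target in set(done)

print(regex_hit, literal_hit, exact_hit)  # True False False
```

With pandas, passing `regex=False` to `str.contains` or comparing the column with `==` avoids the same pitfall.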

In [None]:
word_lists = []
results_folder = '/content/drive/MyDrive/ControlNet-Experiment/Results2'

for watercourse in watercourses:

  #Generate prompt
  prompt = "{}".format(watercourse)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for country in loc_country:

  #Generate prompt
  prompt = "river {}".format(country)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for state in loc_state:

  #Generate prompt
  prompt = "river {}".format(state)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for river in loc_rivers:

  #Generate prompt
  prompt = "{} river".format(river)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for land in morph_land:

  #Generate prompt
  prompt = "{} river".format(land)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for grad in morph_grad:

  #Generate prompt
  prompt = "{} river".format(grad)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for dim in morph_dim:

  #Generate prompt
  prompt = "{} river".format(dim)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for geol in morph_geol:

  #Generate prompt
  prompt = "{} river".format(geol)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for water in env_water:

  #Generate prompt
  prompt = "{} river".format(water)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for flow in env_flow:

  #Generate prompt
  prompt = "{} river".format(flow)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for weather in env_weather:

  #Generate prompt
  prompt = "{} river".format(weather)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))

for season in env_season:

  #Generate prompt
  prompt = "{} river".format(season)
  print(prompt)
  word_lists.append(get_blip(model, processor, results_folder,prompt))



clay river
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/0.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/1.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/2.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/3.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/4.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/5.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/6.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/7.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Results2/clay river/8.jpg
Caption already complete for  /content/drive/MyDrive/ControlNet-Experiment/Resul



0   a stream flows through a rocky valley
1   a river in the mountains
2   a stream in the forest
3   a river in the mountains
4   a river in the mountains
5   a river in the mountains
6   a river in the mountains
7   a river in the mountains
8   a stream in the mountains
9   a stream in the mountains
10   a river in the mountains
11   a river in the mountains
12   a river in the mountains
13   a river in the mountains
14   a stream in the woods
15   a river in the woods
16   a stream in the mountains
17   a stream in the mountains
18   a lake in the mountains
19   a stream in the forest
20   a river in the mountains
21   a river in the mountains
22   a river in the woods
23   a river in the mountains
24   a river in the mountains
25   the river in the mountains
26   a river in the mountains
27   a river in the fall
28   a river in the mountains
29   a stream in the mountains
Updating captions.csv file



In [None]:
import pandas as pd
from collections import Counter
import re

def drop_words_from_list(original_list, words_to_drop):
    return list(filter(lambda word: word not in words_to_drop, original_list))

def get_most_common_words_per_category(df, category_column, text_column, n=10):

    #drop duplicate rows
    df = df.drop_duplicates()

    # Step 1: Group the DataFrame by the category column
    grouped_df = df.groupby(category_column)

    # Initialize an empty dictionary to store the most common words for each category
    most_common_words_per_category = {}

    for category, group in grouped_df:
        # Step 2a: Tokenize the text within each group into individual words
        words = []
        for text in group[text_column]:
            words.extend(re.findall(r'\b\w+\b', text.lower()))

        #Step 2b: Remove common words
        words_to_drop = ['the','in','it','by','at','is','and','a','of','on','with','that']
        words = drop_words_from_list(words,words_to_drop)

        # Step 3: Count the occurrences of each word within each group
        word_counts = Counter(words)

        # Step 4: Find the most common words for each group
        most_common_words = word_counts.most_common(n)
        print(category, most_common_words)

        # Store the most common words for the current category in the dictionary
        most_common_words_per_category[category] = most_common_words

    # Return a dictionary mapping each category to its (word, count) pairs
    return most_common_words_per_category

# Example usage
df_file ="{}/captions.csv".format(results_folder)

data = pd.read_csv(df_file)

# Find the 20 most common words for each category
most_common_per_category_df = get_most_common_words_per_category(data, category_column='prompt', text_column='caption', n=20)
#print(most_common_per_category_df)

Amazon river [('amazon', 26), ('river', 24), ('rainforest', 16), ('jungle', 5), ('boat', 5), ('water', 3), ('trees', 3), ('background', 2), ('red', 2), ('lake', 1), ('mountains', 1), ('full', 1), ('tourists', 1), ('houseboat', 1), ('growing', 1), ('banks', 1), ('people', 1), ('traveling', 1), ('down', 1), ('runs', 1)]
Colorado river [('canyon', 29), ('river', 24), ('grand', 24), ('colorado', 16), ('horseshoe', 4), ('desert', 3), ('green', 3), ('water', 3), ('rio', 2), ('national', 2), ('park', 2), ('canyons', 2), ('overlooks', 1), ('over', 1), ('lake', 1), ('area', 1), ('small', 1), ('boat', 1), ('middle', 1)]
Columbia river [('mountains', 19), ('lake', 18), ('surrounded', 17), ('river', 10), ('trees', 10), ('columbia', 6), ('view', 4), ('from', 3), ('boat', 3), ('top', 2), ('mountain', 2), ('gorge', 1), ('middle', 1), ('valley', 1), ('autumn', 1), ('traveling', 1), ('down', 1), ('flowers', 1), ('water', 1), ('background', 1)]
Congo river [('river', 19), ('amazon', 8), ('water', 7), ('
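
The per-category `(word, count)` lists printed above can be reshaped into long-format rows, which are easier to save and compare than one wide row per category. The sketch below is only an illustration: `to_long_rows` is a hypothetical helper, the sample dictionary holds just the first three entries of two categories copied from the output above, and an in-memory buffer stands in for a file.

```python
import csv
import io

# Sample in the shape printed above: {category: [(word, count), ...]}
most_common = {
    "Amazon river": [("amazon", 26), ("river", 24), ("rainforest", 16)],
    "Colorado river": [("canyon", 29), ("river", 24), ("grand", 24)],
}

def to_long_rows(most_common):
    """Flatten the dictionary into (category, rank, word, count) rows."""
    rows = []
    for category, pairs in most_common.items():
        for rank, (word, count) in enumerate(pairs, start=1):
            rows.append((category, rank, word, count))
    return rows

rows = to_long_rows(most_common)

# Write the rows as CSV text
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["category", "rank", "word", "count"])
writer.writerows(rows)
print(buffer.getvalue())
```

The same rows could be passed to `pd.DataFrame(rows, columns=[...])` if a DataFrame is preferred.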