# Neuroscience inspired AI


## Implementing LSTMs using TensorFlow

## Objective


The goal is to develop hypotheses and working concepts for multimodal inference: a question about an image or a video is posed as text, and the model should answer it adequately.
The hypotheses are based on Visual Question Answering (VQA).

##### Visual Question Answering (VQA) is a task that combines computer vision, natural language processing, and deep learning.
 
##### In VQA, users freely ask natural-language questions about visual (image or video) content. Answering these questions requires a wide range of skills, including localizing and recognizing objects and people, understanding their activities, and applying common sense.

## Task


Given an image, a visual question answering algorithm lets the machine answer free-form, open-ended, natural-language questions about that image.
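
One common way to write this task down, offered here as an assumption and not as a definition taken from this notebook, is to treat answering as picking the most probable answer from a candidate set:

$$
\hat{a} = \arg\max_{a \in \mathcal{A}} P(a \mid I, q; \theta)
$$

Here $I$ is the image, $q$ is the question, $\mathcal{A}$ is the set of candidate answers (for example, the choices of a multiple-choice question), and $\theta$ denotes the model parameters. All four symbols are introduced for illustration only.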
  

In [1]:
#import image module
from IPython.display import Image

# get the image
Image(url="https://miro.medium.com/max/1400/1*yqBRejJjQeQ55DjHVpTqfA.png", width=500, height=500)

## Experimental Evaluation

(1) Study and model the bias in the Visual7W, COCO, and COCO-QA multiple-choice datasets,

(2) measure the effect of using visual features from different CNN architectures,

(3) explore the use of an LSTM as the system’s language model, and

(4) study the transferability of our model between datasets.
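
Points (2) and (3) can be sketched as a small Keras model that encodes the question with an LSTM and combines it with precomputed CNN image features. This is a minimal sketch under stated assumptions, not the model used in this notebook: the function name `build_vqa_model`, all layer sizes, the element-wise fusion, and the treatment of answering as classification over a fixed answer set are illustrative choices.

```python
import numpy as np
import tensorflow as tf


def build_vqa_model(vocab_size=1000, max_question_len=20,
                    image_feature_dim=4096, num_answers=100,
                    embedding_dim=64, lstm_units=128):
    """Build a toy VQA classifier (all sizes are assumed defaults)."""
    # Language branch: token ids -> embeddings -> final LSTM state.
    question_in = tf.keras.Input(shape=(max_question_len,), dtype="int32")
    embedded = tf.keras.layers.Embedding(
        vocab_size, embedding_dim, mask_zero=True)(question_in)
    question_vec = tf.keras.layers.LSTM(lstm_units)(embedded)

    # Vision branch: precomputed CNN features projected to the LSTM size.
    image_in = tf.keras.Input(shape=(image_feature_dim,))
    image_vec = tf.keras.layers.Dense(lstm_units, activation="tanh")(image_in)

    # Fusion by element-wise product, then a softmax over candidate answers.
    fused = tf.keras.layers.Multiply()([question_vec, image_vec])
    hidden = tf.keras.layers.Dense(256, activation="relu")(fused)
    answer_probs = tf.keras.layers.Dense(
        num_answers, activation="softmax")(hidden)

    model = tf.keras.Model([question_in, image_in], answer_probs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


# Forward pass on random stand-in data (no real images or questions).
demo_model = build_vqa_model(vocab_size=50, max_question_len=6,
                             image_feature_dim=8, num_answers=5)
demo_questions = np.random.randint(1, 50, size=(2, 6))
demo_features = np.random.rand(2, 8).astype("float32")
demo_probs = demo_model.predict([demo_questions, demo_features], verbose=0)
print(demo_probs.shape)
```

Swapping the image branch's input for features from a different CNN only changes `image_feature_dim`, which is one way to compare architectures as point (2) proposes.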

In [2]:
#importing necessary libraries and frameworks
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

print(tf.__version__)

2.9.1


## Dataset: Visual Question Answering

#### Visual7W

Visual7W is a large-scale visual question answering (QA) dataset with object-level groundings and multimodal answers. Each question starts with one of the seven Ws: what, where, when, who, why, how, and which. It is collected from 47,300 COCO images and has 327,929 QA pairs, together with 1,311,756 human-generated multiple choices and 561,459 object groundings from 36,579 categories.
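
Because every Visual7W question begins with one of seven words, a simple way to probe dataset bias is a baseline that ignores the image and answers with the most frequent answer for each question word. The sketch below is one such baseline under stated assumptions: the helper names `question_type` and `fit_prior_baseline` are hypothetical, and the QA pairs are made up for illustration, not taken from the dataset.

```python
from collections import Counter, defaultdict

SEVEN_WS = ("what", "where", "when", "who", "why", "how", "which")


def question_type(question):
    """Return the leading W-word of a question, or 'other' if none matches."""
    words = question.strip().lower().split()
    first = words[0] if words else ""
    return first if first in SEVEN_WS else "other"


def fit_prior_baseline(qa_pairs):
    """Map each question type to its most frequent answer.

    qa_pairs is an iterable of (question, answer) string pairs.
    """
    counts = defaultdict(Counter)
    for question, answer in qa_pairs:
        counts[question_type(question)][answer.lower()] += 1
    return {qtype: c.most_common(1)[0][0] for qtype, c in counts.items()}


# Toy, made-up examples (not real Visual7W data).
toy_pairs = [
    ("What color is the bus?", "red"),
    ("What color is the car?", "red"),
    ("What color is the sky?", "blue"),
    ("Where is the cat?", "sofa"),
]
baseline = fit_prior_baseline(toy_pairs)
print(baseline)  # {'what': 'red', 'where': 'sofa'}
```

If such an image-blind baseline scores well on a dataset, a large share of the answers can be guessed from the question alone, which is the kind of bias the evaluation plan sets out to measure.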

#### COCO (Microsoft Common Objects in Context)

The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

Splits: The first version of the MS COCO dataset was released in 2014. It contains 164K images split into training (83K), validation (41K), and test (41K) sets. In 2015, an additional test set of 81K images was released, comprising all the previous test images and 40K new images.

#### COCO-QA

COCO-QA is a dataset for visual question answering. It consists of:

123287 images

78736 train questions

38948 test questions

4 types of questions: object, number, color, location

All answers are a single word.

## Loading Datasets

#### Creating Account in Hub

## Model Experiments on 7W QA

In [3]:
!pip install wget



In [4]:
import os
import zipfile
import wget

VQA_BASE_URL = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/'

# VQA question and annotation archives
VQA_ARCHIVES = [
    'v2_Questions_Train_mscoco.zip',
    'v2_Questions_Val_mscoco.zip',
    'v2_Questions_Test_mscoco.zip',
    'v2_Annotations_Train_mscoco.zip',
    'v2_Annotations_Val_mscoco.zip',
]

def download_vqa(zip_dir='zip', out_dir='annotations'):
    ''' Downloads the VQA archives into zip_dir and unzips them into out_dir.
    Uses the Python wget and zipfile modules, so it does not depend on the
    wget and unzip command-line tools being installed. '''
    os.makedirs(zip_dir, exist_ok=True)
    os.makedirs(out_dir, exist_ok=True)
    for name in VQA_ARCHIVES:
        zip_path = os.path.join(zip_dir, name)
        # Skip archives that were already downloaded
        if not os.path.exists(zip_path):
            wget.download(VQA_BASE_URL + name, out=zip_path)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(out_dir)


In [5]:
download_vqa()



# PK - Data 

In [7]:
import wget

Training_coco = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip'
Validation_coco = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Val_mscoco.zip'
Training_ann = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip'
Validation_ann = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Val_mscoco.zip'

wget.download(Training_coco)
wget.download(Validation_coco)
wget.download(Training_ann)
wget.download(Validation_ann)

100% [....................................................] 10518930 / 10518930

'v2_Annotations_Val_mscoco (2).zip'
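
Once the archives are extracted, questions and annotations have to be matched up before training. The sketch below joins them on `question_id`. It assumes the JSON layout of the public VQA v2 release (a top-level `questions` list with `image_id`, `question`, and `question_id`, and a top-level `annotations` list with `question_id` and `multiple_choice_answer`); check these keys against the extracted files. The function names and the toy records are hypothetical.

```python
import json


def load_json(path):
    """Read one extracted questions or annotations file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def pair_questions_with_answers(questions_json, annotations_json):
    """Join questions and annotations on question_id (assumed schema)."""
    answers = {
        ann["question_id"]: ann["multiple_choice_answer"]
        for ann in annotations_json["annotations"]
    }
    pairs = []
    for q in questions_json["questions"]:
        qid = q["question_id"]
        if qid in answers:  # test questions have no public annotations
            pairs.append({
                "image_id": q["image_id"],
                "question": q["question"],
                "answer": answers[qid],
            })
    return pairs


# Toy records in the assumed layout (not real dataset entries).
toy_questions = {"questions": [
    {"image_id": 1, "question": "What animal is this?", "question_id": 10},
    {"image_id": 2, "question": "How many cups?", "question_id": 20},
]}
toy_annotations = {"annotations": [
    {"question_id": 10, "multiple_choice_answer": "elephant"},
]}
toy_joined = pair_questions_with_answers(toy_questions, toy_annotations)
print(toy_joined)
```

With real data, `load_json` would be called once on a questions file and once on the matching annotations file, and the two results passed to `pair_questions_with_answers`.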

In [8]:
%matplotlib inline
import os, argparse
import cv2, spacy, numpy as np
from keras.models import model_from_json
from keras.optimizers import SGD
import joblib


In [9]:
VQA_model_file_name      = 'models/VQA/VQA_MODEL.json'
VQA_weights_file_name   = 'models/VQA/VQA_MODEL_WEIGHTS.hdf5'
label_encoder_file_name  = 'models/VQA/FULL_labelencoder_trainval.pkl'
CNN_weights_file_name   = 'models/CNN/vgg16_weights.h5'
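
The later cells depend on these four files being present on disk. As a small precaution, they can be checked up front so that a missing file is reported by name. This is a minimal sketch; the helper `find_missing_files` is a hypothetical name, and the paths are repeated from the cell above so the snippet runs on its own.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


required_files = [
    'models/VQA/VQA_MODEL.json',
    'models/VQA/VQA_MODEL_WEIGHTS.hdf5',
    'models/VQA/FULL_labelencoder_trainval.pkl',
    'models/CNN/vgg16_weights.h5',
]

missing = find_missing_files(required_files)
if missing:
    print('Missing model files:', missing)
```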

In [10]:
image_file_name = 'elephant.jpg'
question = u'What animal is in the picture?'

In [11]:
verbose = 0

In [12]:
from spacy.lang.en import English

In [13]:
def get_image_model(CNN_weights_file_name):
    ''' Takes the CNN weights file, and returns the VGG model updated
    with the weights. Requires the file VGG.py inside models/CNN '''
    from models.CNN.VGG import VGG_16
    # this is standard VGG 16 without the last two layers
    image_model = VGG_16(CNN_weights_file_name)

    # `learning_rate` replaces the deprecated `lr` argument
    sgd = SGD(learning_rate=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    image_model.compile(optimizer=sgd, loss='categorical_crossentropy')
    return image_model


## Conclusion  

## References

https://www.geeksforgeeks.org/insert-image-in-a-jupyter-notebook/#:~:text=first%2C%20change%20the%20type%20of,Edit%20%2D%3E%20insert%20image.


https://paperswithcode.com/dataset/visual7w

https://arxiv.org/pdf/1511.03416v4.pdf


https://paperswithcode.com/dataset/coco


https://paperswithcode.com/dataset/coco-qa

