<center><h1>Visual Question Answering (VQA)</h1></center>

<p>Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. Mirroring real-world scenarios, such as helping the visually impaired, both the questions and answers are open-ended. Many open-ended answers contain only a few words, or come from a closed set of answers, so the task can also be posed in a multiple-choice format. Visual questions selectively target different areas of an image, including background details and underlying context. As a result, a Visual Question Answering (VQA) system typically needs a more detailed understanding of the image, and more complex reasoning, than a system that produces generic image captions.</p>

<h1>1. Problem Statement:</h1>

<p>Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. 
</p>

<h1>2. Business objectives and constraints:</h1>

<ul>
<li>No low-latency requirement.</li>
<li>Answers must be accurate.</li>
</ul>

<h1>3. Data Collection</h1>

<p>Data obtained from: <a href='https://visualqa.org/download.html'>https://visualqa.org/download.html</a></p>

<ul>
  <li>
    COCO Training images: <a href='http://images.cocodataset.org/zips/train2014.zip'>http://images.cocodataset.org/zips/train2014.zip</a>
  </li>
  <li>
    VQA Training annotations: <a href='https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip'>https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip</a>
  </li>
  <li>
    VQA Training questions: <a href='https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip'>https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip</a>
  </li>
</ul>

In [0]:
#importing packages
import pandas as pd
import numpy as np
import os
import tensorflow as tf
import json
from google.colab import drive

## <h2>3.1 Mounting the Drive (For saving data)</h2>

<p>Google Colab is a free cloud service for running Python notebooks. Mounting Google Drive lets us read and store files on Drive from within the notebook.</p>

In [0]:
drive.mount('/content/drive/', force_remount=True) #mounting the drive

Mounted at /content/drive/


In [0]:
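#declaring variables that are used throughout the notebook
currentDirectory = "/content/drive/My Drive/pcase_study_2/"
currentDirectory = currentDirectory + "data/"
os.chdir(currentDirectory) #make the data directory the working directory
currentDirectory = "" #from here on, paths are relative to the data directory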

## <h2>3.2 Downloading COCO images</h2>

In [0]:
#Downloading and extracting images
tf.keras.utils.get_file('train2014.zip', cache_subdir = os.path.abspath('.'), 
                        origin = 'http://images.cocodataset.org/zips/train2014.zip', extract = True)

Downloading data from http://images.cocodataset.org/zips/train2014.zip


In [2]:
#count the extracted files without changing the working directory,
#so that the relative paths used in later cells keep working
print("Total Number Images in COCO Train Dataset: ",len(os.listdir(currentDirectory + 'train2014/')))

Total Number Images in COCO Train Dataset:  82783
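
<p>The count above includes every entry in the folder. As a minimal sketch, assuming the images carry the usual image file extensions, the hypothetical helper <code>count_images</code> below counts only image files, so stray files in the directory do not inflate the total. The helper name and the extension list are illustrative and not part of the original notebook.</p>

```python
import os

def count_images(directory, extensions=(".jpg", ".jpeg", ".png")):
    """Count files in `directory` whose names end with an image extension.

    `extensions` is an assumed list; adjust it to the dataset at hand.
    """
    return sum(
        1 for name in os.listdir(directory)
        if name.lower().endswith(extensions)
    )

# Only runs when the extracted folder exists in the working directory.
if os.path.isdir("train2014"):
    print("Image files in train2014:", count_images("train2014"))
```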


## <h2>3.3 Downloading VQA Questions</h2>

In [0]:
#Downloading and extracting questions
tf.keras.utils.get_file('v2_Questions_Train_mscoco.zip',cache_subdir=os.path.abspath('.'),
                        origin = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip',extract = True)

Downloading data from https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Questions_Train_mscoco.zip


'/content/drive/My Drive/pcase_study_2/data/v2_Questions_Train_mscoco.zip'

In [0]:
# read the json file
question_file_path = 'v2_OpenEnded_mscoco_train2014_questions.json'
with open(question_file_path, 'r') as f:
    questions = json.load(f)

print("Total number of questions:", len(questions['questions']))

Total number of questions: 443757


In [0]:
#print one randomly chosen question record
print(questions['questions'][np.random.randint(0, len(questions['questions']))])

{'image_id': 31332, 'question': 'What sport are they playing?', 'question_id': 31332000}


<ul>
<li>image_id: Unique Id of the image</li>
<li>question_id: Unique Id of the question</li>
<li>question: Actual question related to the particular image</li>
</ul>
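
<p>To use a question record, its <code>image_id</code> has to be mapped to an image file. The sketch below assumes the extracted COCO 2014 images follow the naming pattern <code>COCO_train2014_</code> followed by the image id zero-padded to 12 digits. The helper name <code>coco_image_path</code> and its default arguments are hypothetical, so the pattern should be checked against the files in <code>train2014/</code> before relying on it.</p>

```python
import os

def coco_image_path(image_id, image_dir="train2014", split="train2014"):
    """Build the expected file path for a COCO image id.

    Assumes file names of the form COCO_<split>_<12-digit id>.jpg.
    """
    filename = "COCO_{}_{:012d}.jpg".format(split, image_id)
    return os.path.join(image_dir, filename)

# Sample record taken from the question file output above.
sample = {'image_id': 31332,
          'question': 'What sport are they playing?',
          'question_id': 31332000}
print(sample['question'], '->', coco_image_path(sample['image_id']))
```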


## <h2>3.4 Downloading VQA Annotations</h2>

In [0]:
#Downloading and extracting annotations
tf.keras.utils.get_file('v2_Annotations_Train_mscoco.zip',cache_subdir=os.path.abspath('.'),
                        origin = 'https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip',extract = True)

Downloading data from https://s3.amazonaws.com/cvmlp/vqa/mscoco/vqa/v2_Annotations_Train_mscoco.zip


'/content/drive/My Drive/pcase_study_2/data/v2_Annotations_Train_mscoco.zip'

In [0]:
annotation_file_path = 'v2_mscoco_train2014_annotations.json'
with open(annotation_file_path, 'r') as f:
    annotations = json.load(f)

In [0]:
#show one randomly chosen annotation record
annotations['annotations'][np.random.randint(0, len(annotations['annotations']))]

{'answer_type': 'yes/no',
 'answers': [{'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 1},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 2},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 3},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 4},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 5},
  {'answer': 'no', 'answer_confidence': 'yes', 'answer_id': 6},
  {'answer': 'yes', 'answer_confidence': 'maybe', 'answer_id': 7},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 8},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 9},
  {'answer': 'yes', 'answer_confidence': 'yes', 'answer_id': 10}],
 'image_id': 538149,
 'multiple_choice_answer': 'yes',
 'question_id': 538149001,
 'question_type': 'are the'}

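<p>Fields:</p>
<ul>
<li>image_id: Unique Id of the image</li>
<li>question_id: Unique Id of the question related to the image</li>
<li>question_type: Type of the question, given by its opening words (for example 'are the')</li>
<li>multiple_choice_answer: The ground-truth answer, which is the most frequent of the 10 collected answers</li>
<li>answer_type: Type of the actual answer (for example 'yes/no')</li>
<li>answers: Answers from 10 different annotators for the given question, each with its answer_confidence</li>
</ul>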
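
<p>Questions and annotations live in separate files and share the <code>question_id</code> field, so they can be joined on it. The following is a minimal sketch of that join, together with a check of which answer the annotators gave most often. The function names and the toy records are hypothetical and only mimic the structure of the records shown above.</p>

```python
from collections import Counter

def majority_answer(annotation):
    """Return the most frequent answer string among the annotators' answers."""
    counts = Counter(entry['answer'] for entry in annotation['answers'])
    return counts.most_common(1)[0][0]

def pair_questions_with_answers(question_list, annotation_list):
    """Join question and annotation records on question_id."""
    annotation_by_id = {ann['question_id']: ann for ann in annotation_list}
    pairs = []
    for q in question_list:
        ann = annotation_by_id.get(q['question_id'])
        if ann is None:
            continue  # skip questions that have no annotation
        pairs.append({
            'image_id': q['image_id'],
            'question': q['question'],
            'answer': ann['multiple_choice_answer'],
        })
    return pairs

# Toy records (made up for illustration, not taken from the dataset).
toy_questions = [
    {'image_id': 1, 'question': 'Is it sunny?', 'question_id': 1000},
    {'image_id': 2, 'question': 'What color is the car?', 'question_id': 2000},
]
toy_annotations = [
    {'image_id': 1, 'question_id': 1000, 'multiple_choice_answer': 'yes',
     'answers': [{'answer': 'yes'}] * 9 + [{'answer': 'no'}]},
]

pairs = pair_questions_with_answers(toy_questions, toy_annotations)
print(pairs)
print(majority_answer(toy_annotations[0]))
```

<p>With the files loaded above, the same join would be called as <code>pair_questions_with_answers(questions['questions'], annotations['annotations'])</code>.</p>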


## <h2>3.5 Saving Data into Drive</h2>

In [0]:
drive.flush_and_unmount() #flush pending writes so everything is copied to Drive