# Table Question Answering
This notebook uses a language model trained specifically to read a table and answer natural-language questions about it. We will use Google's TAPAS model.

Installing libraries

In [9]:
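!pip install transformers
# torch-scatter wheels are indexed per PyTorch/CUDA build: ${CUDA} must be set in the shell (for example to cpu or cu111), otherwise the URL below ends in "+.html".
!pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+${CUDA}.html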


Looking in links: https://data.pyg.org/whl/torch-1.9.0+.html
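
The "Looking in links" line above ends in `torch-1.9.0+.html`, which suggests the `${CUDA}` variable was empty when the cell ran. Below is a minimal sketch of building the wheel index URL explicitly instead. The helper name `pyg_wheel_index` is hypothetical, and the tag values mentioned (`cpu`, `cu111`) are assumptions that should be checked against the PyTorch build actually installed.

```python
def pyg_wheel_index(torch_version: str, cuda_tag: str) -> str:
    """Build the index URL passed to `pip install torch-scatter -f <url>`.

    cuda_tag is an assumption about the local build, e.g. 'cpu' or 'cu111'.
    """
    if not cuda_tag:
        # An empty tag reproduces the broken "+.html" URL seen in the log.
        raise ValueError("cuda_tag must be non-empty, e.g. 'cpu' or 'cu111'")
    return f"https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html"


print(pyg_wheel_index("1.9.0", "cpu"))
# https://data.pyg.org/whl/torch-1.9.0+cpu.html
```

In a notebook, `torch.__version__` typically reports both parts (for example `1.9.0+cu111`), so the two arguments can usually be read from there.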


Importing libraries

In [11]:
from transformers import AutoModelForTableQuestionAnswering, AutoTokenizer, pipeline
import pandas as pd

For this example we will use a sample data file based on Ookla data, which lists the fastest ISP in each Australian city in 2017.
Source: https://www.lifehacker.com.au/2017/11/revealed-the-fastest-isps-in-each-australian-city/

Download the data.

In [12]:
!wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=1TxyQU9v16GfLz10NvnyJZhgVwECxmyqe' -O 'data_ISP.csv'
data = pd.read_csv(r"data_ISP.csv")

--2022-04-08 11:43:58--  https://docs.google.com/uc?export=download&id=1TxyQU9v16GfLz10NvnyJZhgVwECxmyqe
Resolving docs.google.com (docs.google.com)... 74.125.143.113, 74.125.143.102, 74.125.143.101, ...
Connecting to docs.google.com (docs.google.com)|74.125.143.113|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://doc-00-8g-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/4il2bf1lrhosfu84pgf1v1ust51qho27/1649418225000/03983047858725985766/*/1TxyQU9v16GfLz10NvnyJZhgVwECxmyqe?e=download [following]
--2022-04-08 11:43:59--  https://doc-00-8g-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/4il2bf1lrhosfu84pgf1v1ust51qho27/1649418225000/03983047858725985766/*/1TxyQU9v16GfLz10NvnyJZhgVwECxmyqe?e=download
Resolving doc-00-8g-docs.googleusercontent.com (doc-00-8g-docs.googleusercontent.com)... 108.177.126.132, 2a00:1450:4013:c01::84
Connecting to doc-00-8g-docs.googleusercontent.com (doc-00-8g-docs.goo

In [13]:
data

Unnamed: 0,City,Download (Mbps),Upload (Mbps),Fastest ISP,Speed Score
0,"Adelaide, South Australia",21.93,10.74,TPG,31.09
1,"Brisbane, Queensland",35.08,23.56,Optus,49.19
2,"Canberra, Australian Capital Territory",32.47,14.12,iiNet,35.21
3,"Darwin, Northern Territory",29.62,13.89,iiNet,34.34
4,"Geelong, Victoria",67.05,22.15,iiNet,94.26
5,"Gold Coast, Queensland",32.17,9.07,Optus,91.37
6,"Hobart, Tasmania",27.25,11.74,Telstra,27.8
7,"Melbourne, Victoria",31.63,20.58,Spirit,44.29
8,"Newcastle, New South Wales",33.97,14.74,MyRepublic,57.36
9,"Perth, Western Australia",17.9,7.59,TPG,28.44


Let's convert every cell of the DataFrame to a string, because the TAPAS tokenizer expects the table contents as text.

In [14]:
data = data.astype(str)
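
To illustrate what an "all-string table" looks like, here is a small sketch that reads CSV text with the standard library only and produces a dict of column name to list of strings. The helper name `csv_to_string_table` is hypothetical, and the sample rows are copied from the table above.

```python
import csv
import io


def csv_to_string_table(csv_text: str) -> dict:
    """Parse CSV text into {column name: [cell, ...]} with every cell a str."""
    reader = csv.DictReader(io.StringIO(csv_text))
    table = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            table[name].append(value)  # csv yields strings, never floats
    return table


sample = '''City,Download (Mbps),Fastest ISP
"Adelaide, South Australia",21.93,TPG
"Brisbane, Queensland",35.08,Optus
'''
table = csv_to_string_table(sample)
print(table["Download (Mbps)"])  # ['21.93', '35.08']
```

Passing such a dict to `pd.DataFrame(table)` would give a DataFrame whose cells are already strings, which is the same end state that `astype(str)` produces here.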

Prediction

In [15]:
# Load model & tokenizer
model = 'google/tapas-base-finetuned-wtq'
tapas_model = AutoModelForTableQuestionAnswering.from_pretrained(model)
tapas_tokenizer = AutoTokenizer.from_pretrained(model)

# Initializing pipeline
nlp = pipeline('table-question-answering', model=tapas_model, tokenizer=tapas_tokenizer)


def qa(query, data):
    """Print the query and the answer cells TAPAS selects, then return the cells."""
    print('>>>>>')
    print(query)
    result = nlp({'table': data, 'query': query})
    answer = result['cells']
    print(answer)
    return answer
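
The WTQ-finetuned checkpoint also predicts an aggregation operator. As far as I understand the pipeline's output, the result dict carries that label under an `'aggregator'` key (`NONE`, `SUM`, `AVERAGE` or `COUNT`) next to `'cells'`, and the numeric aggregate itself is left to the caller. The sketch below applies the operator to the selected cells. `apply_aggregator` is a hypothetical helper, and the `example` dict is hand-written for illustration, not real model output.

```python
def apply_aggregator(aggregator: str, cells: list) -> str:
    """Combine the selected cells according to the predicted operator."""
    if aggregator == "COUNT":
        return str(len(cells))
    if aggregator in ("SUM", "AVERAGE"):
        if not cells:
            return ""  # nothing selected, nothing to aggregate
        numbers = [float(cell) for cell in cells]
        total = sum(numbers)
        value = total if aggregator == "SUM" else total / len(numbers)
        return f"{value:.2f}"
    # 'NONE' (or an unknown label): report the cells as they are
    return ", ".join(cells)


# Hand-written stand-in for a pipeline result (assumed shape).
example = {"aggregator": "AVERAGE", "cells": ["21.93", "35.08", "32.47"]}
print(apply_aggregator(example["aggregator"], example["cells"]))  # 29.83
```

With the `qa` helper above, this would mean reading `result.get('aggregator', 'NONE')` in addition to `result['cells']`.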


In [16]:
prediction = qa('What is the highest download speed',data)

>>>>>
What is the highest download speed
['67.05']


In [17]:
prediction = qa('Which city has the highest download speed',data)

>>>>>
Which city has the highest download speed
['Geelong, Victoria']
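
Because the model's answers are predictions, a direct computation is a cheap cross-check for questions like the two above. This is a minimal sketch using three rows copied from the table. The helper name `row_with_max` is hypothetical.

```python
rows = [
    {"City": "Adelaide, South Australia", "Download (Mbps)": "21.93"},
    {"City": "Geelong, Victoria", "Download (Mbps)": "67.05"},
    {"City": "Melbourne, Victoria", "Download (Mbps)": "31.63"},
]


def row_with_max(rows: list, column: str) -> dict:
    """Return the row whose value in `column` is numerically largest."""
    # Cells are strings (as TAPAS requires), so compare them as floats.
    return max(rows, key=lambda row: float(row[column]))


best = row_with_max(rows, "Download (Mbps)")
print(best["City"], best["Download (Mbps)"])  # Geelong, Victoria 67.05
```

Both values agree with the model's answers to the two download-speed questions.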


In [18]:
prediction = qa('fastest ISP of queensland?',data)

>>>>>
fastest ISP of queensland?
['Optus']


In [19]:
prediction = qa('Which city has the highest speed score?',data)

>>>>>
Which city has the highest speed score?
['Wollongong, New South Wales']
