# Mini Program - googleSearch using Python
## Objective
### This program takes a keyword as input from the user and searches for it on Google. It then parses the returned HTML page and displays the content of the Knowledge Panel.

### Step 1 - Import required libraries
#### Libraries are modules that help the programmer perform specific operations.
#### In this program, we use two libraries and one warning class:
#### 1. requests - An Apache 2.0 licensed HTTP library that makes sending HTTP requests simpler and more user friendly
#### 2. BeautifulSoup - Part of the bs4 package; it helps to pull data out of HTML/XML documents
#### 3. InsecureRequestWarning - The warning class that urllib3 raises for unverified HTTPS requests; we import it so that the warning can be suppressed

In [1]:
#importing libraries
import requests
from bs4 import BeautifulSoup
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning) #This disables the warning arising due to unverified https request

### Step 2 - Take input from the user
#### Taking input interactively lets the same program search for any keyword.
#### In Line 1 - We ask the user to enter the keyword to be searched instead of hardcoding it
#### In Line 2 - We replace each space entered by the user with its percent-encoded form, %20 (0x20 is the ASCII code of the space character)
#### In Line 3 - We create the Google search query string from the input given by the user
#### In Line 4 - We print the query string to check that it is in the proper format. Printing helps in debugging the code

In [2]:
query_word = input("Enter the word to be searched : ")
query_word = query_word.replace(" ","%20")
query_string = 'https://www.google.com/search?q=' + str(query_word) #input() already returns a string, so str() is only a safeguard here
print(query_string)

Enter the word to be searched : Chennai Super Kings
https://www.google.com/search?q=Chennai%20Super%20Kings
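
#### The replace call in Line 2 only handles spaces. As a minimal sketch, the standard library function urllib.parse.quote can percent-encode other reserved characters (such as &) as well. The helper name build_query_string is hypothetical and not part of the original program.

```python
from urllib.parse import quote

SEARCH_URL = 'https://www.google.com/search?q='

def build_query_string(query_word):
    """Return the search URL with the keyword percent-encoded (hypothetical helper)."""
    # safe='' makes quote() encode every reserved character, including '/'
    return SEARCH_URL + quote(query_word, safe='')

print(build_query_string('Chennai Super Kings'))
print(build_query_string('AT&T share price'))
```

#### The first call prints the same URL as the cell above; the second shows & encoded as %26, so it is not mistaken for a parameter separator.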


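### Step 3 - Making an HTTP request and storing the response
#### In Line 1 - We make an HTTP GET request to the URL created above and store the response in a variable. verify = False skips TLS certificate verification, which is why the warning was suppressed in Step 1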

In [3]:
response = requests.request("GET", query_string, verify = False)
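
#### The request above does not check whether it succeeded. The sketch below is an assumption and not part of the original program: it adds a timeout and a status check. The function name fetch_page and the 10 second timeout are illustrative choices.

```python
import requests

def fetch_page(url, timeout=10):
    """Fetch a URL and return the response body as text (hypothetical helper)."""
    # verify=False mirrors the original cell; it skips TLS certificate verification
    response = requests.get(url, timeout=timeout, verify=False)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.text
```

#### A call such as fetch_page(query_string) can be wrapped in try/except requests.RequestException to report timeouts, connection errors and HTTP errors in one place.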

### Additional Info - If you are running this code behind a proxy, you have to set up the proxy before making the HTTP request
#### Enter the proxy details in http_proxy and https_proxy
#### Comment out the request in the code cell above and uncomment the 4 lines of code below

In [56]:
#http_proxy = "http://xxx.xxx.xx.xx:yy"
#https_proxy = "https://xxx.xxx.xx.xx:yy"
#proxyDict = { "http" : http_proxy, "https" : https_proxy }
#response = requests.request("GET", query_string, proxies=proxyDict,verify = False)
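
#### As a sketch of the same setup, the proxy dictionary can be built by a small function. The name build_proxies and the address 10.0.0.1:8080 are placeholders, not values from the original program.

```python
def build_proxies(host, port):
    """Return a proxies dictionary in the format expected by requests (hypothetical helper)."""
    # Assumption: the same proxy server handles both http and https traffic
    # and is itself reached over plain http, which is a common setup
    proxy_url = "http://{}:{}".format(host, port)
    return {"http": proxy_url, "https": proxy_url}

proxyDict = build_proxies("10.0.0.1", 8080)  # placeholder address
print(proxyDict)
```

#### The result can be passed as proxies=proxyDict, exactly as in the commented request above.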

### Step 4 - Using BeautifulSoup to parse the web response
#### In Line 1 - We parse the response text into an HTML tree that can be searched
#### In Line 2 - We find the div tag with the class mraOPb
##### This class contains the description of the keyword shown in the Knowledge Panel
##### Some searches do not produce a Knowledge Panel. We handle this by falling back to the description of the first search result (the span tag with the class st)
#### In Lines 3 to 5 - If the div was found, we store its text (extracted in Line 2) in a variable and print it
#### In the else branch - We look up the fallback span and print its text instead

In [4]:
soup = BeautifulSoup(response.text,'html.parser')
desc = soup.find('div',attrs={'class':'mraOPb'}) #find returns the first matching div tag, or None if there is no match
if desc is not None:
    desc_text = desc.text.strip() #desc.text holds the text of the tag found above. strip() removes leading and trailing whitespace
    print(desc_text)
else:
    desc1 = soup.find('span',attrs={'class':'st'}) #fall back to the description of the first search result
    if desc1 is not None:
        print(desc1.text.strip())
    else:
        print("No description found for this search") #neither tag is present in the response

The Chennai Super Kings is an Indian franchise cricket team based in Chennai, Tamil Nadu, which plays in the Indian Premier League. Founded in 2008, the team plays its home matches at the M. A. Chidambaram Stadium in Chennai. Wikipedia
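
#### The class names in Google's markup may change, so it helps to test the parsing logic on a small HTML snippet without making a request. The sketch below is an assumption: the function name extract_description and the sample HTML are made up for illustration, and only the class names mraOPb and st come from the program above.

```python
from bs4 import BeautifulSoup

def extract_description(html):
    """Return the Knowledge Panel text, else the first result snippet, else None."""
    soup = BeautifulSoup(html, 'html.parser')
    # Try the Knowledge Panel div first, then the search result span
    for tag_name, class_name in (('div', 'mraOPb'), ('span', 'st')):
        tag = soup.find(tag_name, attrs={'class': class_name})
        if tag is not None:
            return tag.text.strip()
    return None

# Made-up snippets standing in for a real response
panel_html = '<div class="mraOPb"> Panel description </div><span class="st">Snippet</span>'
snippet_html = '<span class="st"> First result snippet </span>'

print(extract_description(panel_html))
print(extract_description(snippet_html))
print(extract_description('<p>No match</p>'))
```

#### Returning None instead of printing lets the caller decide what to show when neither tag is present.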
