<center>
<h1><b>Regional Language Weather Chatbot</b></h1>
<h3>Chatbot to address users' weather related queries in Gujarati language</h3>

<img src="https://miro.medium.com/max/800/1*QVnVYYqQ6Wx4B74kOM-VFQ.png" width=400px>
</center>

This is a simple weather chatbot that answers users' weather-related queries in Gujarati, one of the many regional languages of India.

The chatbot is developed using Python, making use of packages such as NLTK (for natural language processing), PyOWM (for getting weather information from OpenWeatherMap API), and googletrans (for translating user input and output).

Users can ask simple weather related queries such as "What is the weather today?" in Gujarati and the chatbot answers the query with appropriate information.

# Part 0: Environment Setup

To develop this chatbot, the following packages and libraries are used:


*   **Natural Language Toolkit (NLTK):** To perform natural language processing functions on user query, such as tokenization and stemming.
*   **Googletrans:** Python wrapper for Google's Translate API, used to translate input and output to the users' language (regional language).
*   **PyOWM:** Python wrapper for OpenWeatherMap's Weather API, which is used to collect current weather and weather forecast data for users' location.

Other libraries required to develop functions and for housekeeping tasks are listed below:


*   **Requests:** Used to get data from URLs through HTTP requests.
*   **String:** For performing string operations.
*   **Datetime:** To handle date and time data in appropriate formats.
*   **JSON:** To work with data obtained from URLs.








In the following code cell, all required packages are installed and imported.

In [None]:
%%capture
# Install and import required libraries (the %%capture cell magic must be the first line of the cell)
!pip install googletrans==4.0.0rc1                                              # translation of user input and output
!pip install nltk==3.5                                                          # natural language processing operations
!pip install pyowm==3.2.0                                                       # API calls to get weather data
!pip install requests==2.25.1                                                   # data from URLs through HTTP requests
from googletrans import Translator                                              # import Translator class for translation functions
import nltk                                                                     
nltk.download('stopwords', quiet=True)                                          # download all English stopwords
from nltk.corpus import stopwords                                               # import stopwords package from NLTK
from nltk.stem import PorterStemmer                                             # import PorterStemmer class to perform stemming
import string                                                                   # string operations
from pyowm import OWM                                                           # OWM class to get weather data
from datetime import datetime                                                   # handle date and time
import requests
import json                                                                     # work with JSON data

In the next code cell, important queries for the weather chatbot are defined. The chatbot handles user queries based on these pre-defined queries.

In [None]:
# Predefined queries for chatbot
queries = ['what is the weather today',
           'what is the temperature today',
           'what is the maximum / highest temperature today',
           'what is the minimum / lowest temperature today',
           'what is the chance of rain today',
           'what is the chance of rain tomorrow',
           'what is the weather forecast for today',
           'what is the weather forecast for tomorrow',
           'what time is the sunset today',
           'what time is the sunrise tomorrow']

# Part 1: NLP for user queries

In this section, I have written the functions required to handle user input and process it for the next stages of the chatbot.

The user enters weather-related queries in Gujarati. Each query is first translated to English, because the NLTK stopword list and Porter stemmer used in the following steps operate on English text.

The query is then processed using natural language processing operations such as tokenization, stopword removal and stemming.

For this purpose, the following functions are coded:

*   **Translation to English:** Used to translate user input to English.
*   **Translation to Gujarati:** Used to translate output to Gujarati.
*   **Tokenization:** To tokenize (split into individual words) translated user query.
*   **Remove stopwords:** To remove stopwords from user query for more efficient processing.





In the following code cell, a Translator object is created for translation and the target language is set to Gujarati (gu).

In [None]:
# Create Translator object for translation and set target language
translator = Translator()                                                       # create Translator object
target_lang = 'gu'                                                              # set target language as 'gu' for Gujarati

The next two functions translate the chatbot's answer (output) to Gujarati and the user's input query to English, respectively. Both use the Translator object created in the previous code cell.

In [None]:
# Function to translate result to target language
def translate_to_target_language(result):
    return translator.translate(result, dest=target_lang).text                  # return Gujarati translation of chatbot's answer

In [None]:
# Function to translate user query to English
def translate_to_EN(user_query):                                                
    return translator.translate(user_query, dest='en').text.lower()             # return English translation of user query (input)

The following function tokenizes the user query after it has been translated to English. First, all punctuation is removed from the input string using the string library.

The input string is then split into its individual words. Each individual word is known as a token. Each token is then stemmed to its root form using an object of the PorterStemmer class.

Finally, the function returns a list of tokens, stemmed to their root form, for further processing.

In [None]:
# Function to remove punctuations, tokenize text and stem words in text
def tokenize(text):
    text = text.translate(str.maketrans('', '', string.punctuation))            # remove punctuation from input text
    ps = PorterStemmer()                                                        # create object of PorterStemmer class
    tokens = []                                                                 # initialise empty list to store tokens
    for token in text.split():
        tokens.append(ps.stem(token))                                           # stem token and append to tokens list
    return tokens                                                               # return list of stemmed tokens
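
The punctuation-removal and splitting steps can be tried on their own, without NLTK. The sketch below is only an illustration: `strip_and_split` is a hypothetical helper name, and it deliberately leaves out the stemming step that `tokenize()` performs.

```python
import string

def strip_and_split(text):
    # Same punctuation-removal idiom as tokenize(), but without stemming
    cleaned = text.translate(str.maketrans('', '', string.punctuation))
    # Split on whitespace to get one token per word
    return cleaned.split()

tokens = strip_and_split("what is the weather today?")
print(tokens)  # ['what', 'is', 'the', 'weather', 'today']
```

With stemming added, each of these tokens would additionally be reduced to its root form by the PorterStemmer object.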

The following function is used to remove all stopwords, such as "am", "are", "of", etc., from input tokenized text. For this, English stopwords in the stopwords package of NLTK library are used.

The function returns a set of tokens of the input text after stopword removal. Hence, only the keywords are retained.

In [None]:
# Function to remove English stopwords from text
def remove_stopwords(tokenized_text):
    stopwords_EN = stopwords.words('english')                                   # all English stopwords
    return set([word for word in tokenized_text if word not in stopwords_EN])   # return set of keywords only
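
As a rough illustration of what stopword removal leaves behind, the sketch below filters a token list against a small, hand-picked stopword set. Both `SAMPLE_STOPWORDS` and `keep_keywords` are assumptions made for this example; the chatbot itself uses NLTK's much longer English list.

```python
# Hypothetical, hand-picked subset of English stopwords (not NLTK's full list)
SAMPLE_STOPWORDS = {'what', 'is', 'the', 'of', 'for'}

def keep_keywords(tokens, stopwords=SAMPLE_STOPWORDS):
    # Return the set of tokens that are not stopwords
    return {token for token in tokens if token not in stopwords}

keywords = keep_keywords(['what', 'is', 'the', 'weather', 'today'])
print(sorted(keywords))  # ['today', 'weather']
```

Returning a set rather than a list discards duplicates and word order, which is what the set-based similarity measure in the next part expects.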

# Part 2: Understanding user query

In this part of the project, functions required to understand user query are developed for the chatbot to be able to answer them.

After processing the user query, a similarity score between the users' query and chatbot's pre-defined queries is calculated in order to get the most similar query.

The following functions are defined:
*   **Jaccard Similarity:** To calculate degree of similarity between user query and pre-defined query based on Jaccard similarity.

<center><img src="https://www.gstatic.com/education/formulas2/355397047/en/jaccard_index.svg"></center>

*   **Most similar query:** To determine the pre-defined chatbot query which is most similar to the input user query.



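The following function calculates a similarity score using the Jaccard similarity measure: the number of tokens shared by the two sets divided by the number of distinct tokens across both sets. I have chosen Jaccard similarity because it is simple to compute on sets of keywords and does not depend on word order, which suits short queries like these.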

In [None]:
# Function to calculate Jaccard similarity between 2 texts
def jaccard_similarity(set1, set2):
    intersection = set1.intersection(set2)                                      # calculate intersection of 2 sets of tokens
    union = set1.union(set2)                                                    # calculate union of 2 sets of tokens
    return len(intersection)/len(union) if union else 0.0                       # return Jaccard similarity score (0.0 if both sets are empty)
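
A small worked example may help make the score concrete. The sketch below is self-contained and uses unstemmed keyword sets chosen purely for illustration; the helper name `jaccard` is hypothetical and returns 0.0 when both sets are empty.

```python
def jaccard(set1, set2):
    # |A intersect B| / |A union B|, with 0.0 for two empty sets
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0

user_keywords = {'weather', 'today'}
query_keywords = {'temperature', 'today'}

# One shared token ('today') out of three distinct tokens overall
score = jaccard(user_keywords, query_keywords)
print(round(score, 3))  # 0.333
```

Identical sets score 1.0 and sets with no tokens in common score 0.0.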

The following function is used to determine which pre-defined chatbot query is most similar to the users' input query, based on Jaccard similarity measure.

In [None]:
# Function to determine query which is most similar to user query
def most_similar_query(user_query_tokens):
    highest_sim_score = 0.0                                                     # set initial highest similarity score to 0.0
    most_sim_query = ''                                                         # set initial value for most similar query
    for query in queries:
        query_tokens = remove_stopwords(tokenize(query))                        # tokenize, stem and remove stopwords from pre-defined query
        sim_score = jaccard_similarity(user_query_tokens, query_tokens)         # calculate similarity score using jaccard_similarity() function
        if sim_score>highest_sim_score:                                         
            highest_sim_score = sim_score                                       # set highest similarity score
            most_sim_query = query                                              # set value for most similar query
    return most_sim_query                                                       # return the pre-defined query which is most similar to users' input query
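
The selection logic can be sketched without NLTK by precomputing keyword sets for a few queries. Everything in this example (the `candidates` dictionary, its unstemmed keyword sets, and the names `jaccard` and `best_match`) is an assumption made for illustration.

```python
def jaccard(set1, set2):
    # Jaccard similarity, with 0.0 for two empty sets
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0

# Hypothetical keyword sets for three of the pre-defined queries (unstemmed)
candidates = {
    'what is the weather today': {'weather', 'today'},
    'what is the chance of rain tomorrow': {'chance', 'rain', 'tomorrow'},
    'what time is the sunset today': {'time', 'sunset', 'today'},
}

def best_match(user_tokens):
    best_query, best_score = '', 0.0
    for query, tokens in candidates.items():
        score = jaccard(user_tokens, tokens)
        if score > best_score:          # strict '>' keeps the first of any tied queries
            best_query, best_score = query, score
    return best_query

print(best_match({'sunset', 'today'}))   # what time is the sunset today
print(repr(best_match({'cricket'})))     # ''
```

When no pre-defined query shares a token with the input, the empty string is returned. In the chatbot, this value does not appear in the list of pre-defined queries, so the answer function falls through to its apology message.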

# Part 3: Getting weather data

The following section contains functions to get weather data for the users' location and give output to the users.

Based on the most similar query determined in the previous part, the chatbot queries the weather API for appropriate data. Weather data is specific to the users' location (city) which is determined from the users' IP data.

Finally, an appropriate answer is given as output to the user by the chatbot.

The following functions are defined:
*   **User location:** To get current location of user based on IP data.
*   **Weather information:** To get weather data from OpenWeatherMap's weather API and formulate an appropriate answer.

<br><center><img src="https://19yw4b240vb03ws8qm25h366-wpengine.netdna-ssl.com/wp-content/uploads/OPENWEATHER-300x136.png" width=400px></center><br>

The following function determines users' current location based on IP data through the IP Data API. The function returns the users' current city.

In [None]:
# Function to get user location through ipdata API
def get_user_location():
    api_key = 'YOUR-IPDATA-API-KEY'                                             # IP Data API key
    ipdata = requests.get('https://api.ipdata.co?api-key=' + api_key).json()    # get IP data from IPData website through HTTP requests
    return ipdata['city']                                                       # return users' current city
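
The lookup above assumes that the response is valid JSON with a non-empty `city` field. A minimal sketch of a more defensive parse is shown below; the helper name `city_from_payload`, the sample payloads, and the fallback city are all assumptions for illustration and are not part of the ipdata API.

```python
import json

def city_from_payload(payload_text, default_city='Ahmedabad'):
    """Return the 'city' field of a JSON payload, or a fallback city."""
    try:
        data = json.loads(payload_text)
    except json.JSONDecodeError:
        return default_city                     # response body was not valid JSON
    city = data.get('city') if isinstance(data, dict) else None
    return city or default_city                 # field missing, null or empty

print(city_from_payload('{"city": "Surat"}'))   # Surat
print(city_from_payload('{"city": null}'))      # Ahmedabad
```

A fallback of this kind keeps the chatbot usable when the location service cannot resolve a city for the user's IP address.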

The next code cell configures OpenWeatherMap's weather API using an API key and creates a weather manager object to get weather data from the API.

In [None]:
# Configure PyOWM with OpenWeatherMap API key
APIKEY = 'YOUR-OWM-API-KEY'                                                     # OpenWeatherMap API key
owm = OWM(APIKEY)                                                               # configure with API key
mgr = owm.weather_manager()                                                     # create weather manager object to get weather data

The following function formulates an appropriate answer to the users' input query and returns the output to be displayed to the user. The chatbot's answer is translated back to Gujarati for the user to be able to understand, using the translation function defined in Part 1.

OpenWeatherMap's weather API is used to get weather data for the users' current location.

In [None]:
# Function to get weather information based on query
def weather_info(query):
    user_city = get_user_location()                                             # get users' location (city)
    weather_data = mgr.weather_at_place(user_city).weather                      # get weather data for users' city
    forecast_3h = list(mgr.forecast_at_place(user_city, '3h').forecast)         # get weather forecast data

    if query in queries:
        index = queries.index(query)
    else:
        index = -1
    if index==0:
        status = str(weather_data.detailed_status)                              # get current weather status
        current_temp = str(weather_data.temperature('celsius')['temp'])         # get current temperature (in degrees celsius)
        max_temp = str(weather_data.temperature('celsius')['temp_max'])         # get maximum temperature (in degrees celsius)
        min_temp = str(weather_data.temperature('celsius')['temp_min'])         # get minimum temperature (in degrees celsius)
        humidity = str(weather_data.humidity)                                   # get humidity level (percentage)
        result = status.capitalize() + " today. Current temperature is " + current_temp + " degrees celsius (Maximum: " + max_temp + "*C, Minimum: " + min_temp + "*C). Humidity today is " + humidity + "%."
        return result                                                           # return chatbot answer
    elif index==1: 
        current_temp = str(weather_data.temperature('celsius')['temp'])         # get current temperature (in degrees celsius)
        result = "The current temperature is " + current_temp + " degrees celsius."
        return result                                                           # return chatbot answer
    elif index==2:
        max_temp = str(weather_data.temperature('celsius')['temp_max'])         # get maximum temperature (in degrees celsius)
        result = "The maximum temperature today is " + max_temp + " degrees celsius."
        return result                                                           # return chatbot answer
    elif index==3:
        min_temp = str(weather_data.temperature('celsius')['temp_min'])         # get minimum temperature (in degrees celsius)
        result = "The minimum temperature today is " + min_temp + " degrees celsius."
        return result                                                           # return chatbot answer
    elif index==4:
        rain_today = weather_data.rain                                          # get current rain data (empty if no rain is reported)
        if len(rain_today)==0:
            result = "There is no chance of rain today."            
        else:
            result = "Rain today is " + str(rain_today.get('1h', 0)) + " mm."   # rain volume over the last hour, in millimetres
        return result                                                           # return chatbot answer
    elif index==5:
        tomorrow = forecast_3h[min(8, len(forecast_3h) - 1)]                    # forecast entry about 24 hours ahead (8 steps of 3 hours)
        rain_tomorrow = tomorrow.rain                                           # get forecast rain data for tomorrow
        if len(rain_tomorrow)==0:
            result = "There is no chance of rain tomorrow."
        else:
            result = "Rain tomorrow will be " + str(rain_tomorrow.get('3h', 0)) + " mm."
        return result                                                           # return chatbot answer
    elif index==6:
        status = str(forecast_3h[0].detailed_status)                            # get today's forecasted weather status for users' location
        result = status.capitalize() + " forecasted today."                     # formulate answer
        return result                                                           # return chatbot answer
    elif index==7:
        status = str(forecast_3h[min(8, len(forecast_3h) - 1)].detailed_status) # forecast about 24 hours ahead (the last entry is several days out)
        result = status.capitalize() + " forecasted tomorrow."                  # formulate answer
        return result                                                           # return chatbot answer
    elif index==8:
        sunset_time_unix = weather_data.sunset_time() + 19800                   # sunset time (Unix seconds) shifted by 19800 s = +05:30 (IST); assumes the machine clock is in UTC
        sunset_time = str(datetime.fromtimestamp(sunset_time_unix).strftime('%H:%M:%S'))
        result = "Time of sunset today is " + sunset_time + "."                 # formulate chatbot answer
        return result                                                           # return chatbot answer
    elif index==9:
        sunrise_time_unix = weather_data.sunrise_time() + 19800                 # today's sunrise shifted to IST, used as an approximation for tomorrow's sunrise
        sunrise_time = str(datetime.fromtimestamp(sunrise_time_unix).strftime('%H:%M:%S'))
        result = "Time of sunrise tomorrow will be " + sunrise_time + "."       # formulate chatbot answer
        return result                                                           # return chatbot answer
    else:
        result = "I am sorry, I don't understand your question."
        return result                                                           # return apology if user query is not understood
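
The fixed 19800-second shift in the sunset and sunrise branches corresponds to UTC+05:30 (Indian Standard Time). Because `datetime.fromtimestamp` without a `tz` argument converts using the machine's local time zone, the shifted value shows IST only when the notebook runs on a machine set to UTC. One possible alternative, sketched here with the hypothetical names `IST` and `format_local_time`, is to pass an explicit time zone instead of shifting the timestamp.

```python
from datetime import datetime, timedelta, timezone

# UTC+05:30, i.e. the same 19800-second offset used in the function above
IST = timezone(timedelta(seconds=19800))

def format_local_time(unix_ts, tz=IST):
    # Convert Unix seconds to HH:MM:SS in the given time zone,
    # independent of the time zone of the machine running the code
    return datetime.fromtimestamp(unix_ts, tz=tz).strftime('%H:%M:%S')

print(format_local_time(0))                # 05:30:00
print(format_local_time(0, timezone.utc))  # 00:00:00
```

This sketch still hard-codes a single time zone, so it shares the original assumption that the user is located in India.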

# Part 4: Building the chatbot

In the final part of this project, I have developed greeting and goodbye functions so that the chatbot can greet the user at the start of a chat and say goodbye at the end. Both messages are translated to Gujarati and printed as output to the user.

The chatbot is run and user query is taken as input from the user in the regional language.

The following functions are defined in this section:
*   **Greeting:** To greet the user at the start of a chat.
*   **Goodbye:** To say goodbye to the user at the end of a chat.
*   **Run:** Main function to run the chatbot and start talking.





In [None]:
# Function to greet user at start of chat
def greet():
    greeting = translator.translate("Hello! I am your weather assistant. Ask me queries related to weather. \n\nTo end this chat, say 'goodbye'.", dest=target_lang).text
    print(greeting)                                                             # greet user at start of chat
    
# Function to say goodbye to user at end of chat
def goodbye():
    farewell = translator.translate("Goodbye! Have a nice day!", dest=target_lang).text
    print('\n\n' + farewell)                                                    # say goodbye to user at end of chat

In [None]:
# Function to run chatbot
def run():
    talking = True                                                              # set boolean variable 'talking' to True at start of chat
    greet()                                                                     # greet the user
    while(talking):
        prompt = translator.translate("Ask your query here", dest=target_lang).text
        user_query_GJ = input('\n\n' + prompt + ": ")                           # get user query as input in Gujarati
        user_query_EN = translate_to_EN(user_query_GJ)                          # translate user query to English
        if user_query_EN=='goodbye' or user_query_EN=='see you soon':           
            talking = False                                                     # set boolean variable 'talking' to False to break loop and end chat
            break
        else:
            user_query_EN_tokens = remove_stopwords(tokenize(user_query_EN))    # tokenize user query and remove stopwords
            chatbot_query = most_similar_query(user_query_EN_tokens)            # get most similar chatbot query for users' input query
            output = translate_to_target_language(weather_info(chatbot_query))  # formulate chatbot's answer (output to user)
            print('\n' + output)                                                # display output to user in Gujarati
    goodbye()                                                                   # say goodbye to user at the end of chat
    return
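
The control flow of the loop can be exercised without network access by passing the input, answer and output steps in as functions. The sketch below is a simplified stand-in, not the chatbot itself: `chat_loop` and its parameters are hypothetical names, and the translation and weather lookup are replaced by a stub that echoes the query.

```python
def chat_loop(read_input, answer, write_output,
              exit_phrases=('goodbye', 'see you soon')):
    # Keep answering queries until the user enters one of the exit phrases
    while True:
        query = read_input().strip().lower()
        if query in exit_phrases:
            break
        write_output(answer(query))

# Scripted inputs stand in for input(); a list collects what would be printed
inputs = iter(['weather today', 'Goodbye'])
outputs = []
chat_loop(lambda: next(inputs), lambda q: 'echo: ' + q, outputs.append)
print(outputs)  # ['echo: weather today']
```

Separating the loop from `input()` and `print()` in this way makes it possible to check the exit condition with scripted inputs.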

In [None]:
# Start chatbot
run()

નમસ્તે!હું તમારો હવામાન સહાયક છું.મને હવામાનથી સંબંધિત પ્રશ્નો પૂછો.

આ ચેટને સમાપ્ત કરવા માટે, 'ગુડબાય' કહો.


અહીં તમારી ક્વેરી પૂછો: આજે હવામાન

આજે સ્પષ્ટ આકાશ.વર્તમાન તાપમાન 39.0 ડિગ્રી સેલ્સિયસ છે (મહત્તમ: 39.0 * સી, ન્યૂનતમ: 39.0 * સી).આજે ભેજ 30% છે.


અહીં તમારી ક્વેરી પૂછો: આજે સૂર્યાસ્ત

સૂર્યાસ્તનો સમય આજે 18:53:21 છે.


અહીં તમારી ક્વેરી પૂછો: આવજો


આવજો! તમારો દિવસ શુભ રહે!
