# Question Classifier - A Rule-Based Approach

<b>Why Rule-Based?</b><br>
I have used a rule-based approach because the cases and the vocabulary involved here are limited.<br>
Pattern matching with <b>regex</b> is likely to do a better job than ML/DL models trained on a typically small dataset.<br>

ML/DL models are useful when it becomes very difficult to cover all the cases by hand. Here, the cases are limited. ML/DL models would also typically need a large dataset, often on the order of a million data points, to perform well.<br>


Task:<br>
Build a question classifier that takes a sentence as input and predicts whether the sentence is a question or not.<br>
For example:<br>
statement: do you like food?<br>
predicted_question : 1<br>
statement: The boy who sat beside him was his son.<br>
predicted_question : 0<br>

Here the model is a function named <b>predictor()</b>.<br>
<b>predictor(sentence)</b> takes a sentence (a string) as input and returns the prediction.<br>
Call predictor() once for every string you want to classify.<br>

<b>Output</b>: predicted_question: 1 if the sentence is a question<br>
                  predicted_question: 0 if the sentence is just a statement.<br>





In [1]:
import re

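# Words that mark a question when they start the sentence
head_words = ['who', 'which', 'what', 'where', 'when', 'how', 'why', 'whose', 'is']

# Questioning (auxiliary) words that mark a question when directly followed by a word from `prepositions`
before_prep = ['are', 'did', 'may', 'shall', 'will', 'could', 'do', 'is', 'would', 'should',
               "shouldn't", "couldn't", "wouldn't", "isn't", "aren't",
               'arent', 'couldnt', 'wouldnt', 'isnt', 'shouldnt', "don't"]
# Despite the name, these are mostly pronouns and determiners
prepositions = ['i', 'we', 'us', 'you', 'your', 'them', 'they', 'he', 'she',
                'these', 'those', 'their', 'someone', 'somebody', 'it']

# Words that mark a question wherever they appear in the sentence ('who' is not included)
sub_words = ['which', 'what', 'where', 'when', 'how', 'why', 'whose']

# \b(?:...)\b makes every alternative match as a whole word only
compiled_words = re.compile(r'\b(?:%s)\b' % '|'.join(head_words))
compiled_sub_words = re.compile(r'\b(?:%s)\b' % '|'.join(sub_words))


def cleanpunc(sentence):
    # Replace punctuation and special characters with spaces.
    # '?' and apostrophes are kept: predictor() checks for '?' and the word lists contain "don't" etc.
    return re.sub(r'[.|!"#,()/]', ' ', sentence)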


def predictor(sentence):
    # Normalise case and replace punctuation with spaces before matching
    sentence = cleanpunc(sentence.lower())

    # Rule 1: the sentence starts with one of head_words
    if compiled_words.match(sentence) is not None:
        return 1

    # Rule 2: one of sub_words appears anywhere in the sentence
    if compiled_sub_words.search(sentence) is not None:
        return 1

    # Rule 3: a questioning word is directly followed by a word from `prepositions`
    if questions_before_prep(sentence):
        return 1

    # Rule 4: the sentence contains a question mark
    if "?" in sentence:
        return 1

    return 0
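
The first two checks in predictor() differ only in whether the pattern must match at the start of the sentence (`match`) or anywhere in it (`search`). The following is a minimal sketch of that difference, using a shortened word list; `QUESTION_WORDS` and the two helper functions are names made up for this illustration and are not part of the notebook.

```python
import re

# Shortened, illustrative word list (an assumption for this sketch)
QUESTION_WORDS = ["who", "what", "where"]
pattern = re.compile(r"\b(?:%s)\b" % "|".join(QUESTION_WORDS))


def starts_with_question_word(sentence):
    # match() succeeds only if the pattern matches at position 0
    return pattern.match(sentence.lower()) is not None


def contains_question_word(sentence):
    # search() scans the whole string for the first match
    return pattern.search(sentence.lower()) is not None


print(starts_with_question_word("Where is Hari"))                        # True
print(starts_with_question_word("Raju is at home, where is Hari"))       # False
print(contains_question_word("Raju is at home, where is Hari"))          # True
print(contains_question_word("The boy who sat beside him was his son.")) # True
```

The last line shows that searching for "who" anywhere would flag a relative clause as a question, which is presumably why `who` is listed in `head_words` but not in `sub_words`.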
    
  
    
    
    
    
    
    



  


In [2]:
# One pattern for the whole rule: a word from before_prep, then whitespace,
# then a word from prepositions. The \b(?:...) grouping makes every alternative
# match as a whole word, so that 'is' does not match inside 'this'.
compiled_before_prep = re.compile(
    r'\b(?:%s)\s+(?:%s)\b' % ('|'.join(before_prep), '|'.join(prepositions)))


def questions_before_prep(sentence):
    # True when a questioning word is directly followed by one of the
    # `prepositions` words, e.g. "could you", "is it", "should i"
    return compiled_before_prep.search(sentence) is not None
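
When a pattern is built by joining a word list with `|`, the position of `\b` matters. Here is a small sketch of the difference between the grouped and ungrouped forms, using a shortened, hypothetical word list (`words`, `ungrouped` and `grouped` are names chosen for this illustration only):

```python
import re

words = ["are", "is"]  # shortened, hypothetical word list

# Without a group, \b binds only to the last alternative: the pattern is  are|is\b
ungrouped = re.compile(r"%s\b" % "|".join(words))

# The non-capturing group applies \b to both ends of every alternative
grouped = re.compile(r"\b(?:%s)\b" % "|".join(words))

sentence = "we care about this"
print(ungrouped.findall(sentence))               # ['are', 'is']  (inside "care" and "this")
print(grouped.findall(sentence))                 # []
print(grouped.findall("this is fine, are you"))  # ['is', 'are']
```

The ungrouped form reports hits inside longer words, which can turn ordinary statements into false positives in a rule-based classifier.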
    
            
            
    

# Cases covered

1) Sentences starting with who, which, what, where, when, how, why, whose or is.<br>

2) Sentences in which a questioning word (are, did, may, shall, will, could, do, is, would, should, shouldn't, couldn't, wouldn't, isn't, aren't, arent, couldnt, wouldnt, isnt, shouldnt, don't) comes directly before one of the words in the <b>prepositions</b> list (in grammatical terms these are mostly pronouns such as I, we, you, it). This word order is how questions are formed in English.<br>
    example:-<br>
    a) Shouldn't we go out?<br>
    b) Is it raining<br>
    c) Neha is alone at home. Could you pick her up<br>
    d) Hi!!! I'm Wysa. May I know your name?<br>
    <b>Notice:</b> Some of the questions don't have a question mark in them<br>

3) Sentences containing runs of punctuation are also classified as a question or a statement.<br>
    example:-<br>
    a) Hey Jimmmy...........How are you?<br>
    b) Hey Wysa!!!!!! I would like to meet you.<br>
    c) I'm not well,,,,,,                could you suggest me some medicines.<br>

4) The classifier should understand the difference between sentences whose words are interchanged. This has also been taken care of.<br>
    example:-<br>
    a) It is raining.<br>
    b) Is it raining.<br>

5) The first word of a sentence does not have to be a questioning word. The input can begin with an ordinary statement and then contain a question.<br>
    example:-<br>
    a) It's raining heavily outside, I wanted to leave, should I<br>
    b) Raju is at home, where is Hari<br>
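
The word-order rule behind cases 2, 4 and 5 can be sketched with a single pattern. This is a simplified illustration with shortened word lists, not the notebook's own code; `AUX`, `PRONOUNS`, `inverted` and `has_inversion` are names chosen here for the example.

```python
import re

# Shortened, illustrative lists (assumptions for this sketch)
AUX = ["is", "could", "should", "may"]
PRONOUNS = ["i", "you", "it", "we"]

# A questioning word, then whitespace, then a pronoun, all as whole words
inverted = re.compile(r"\b(?:%s)\s+(?:%s)\b" % ("|".join(AUX), "|".join(PRONOUNS)))


def has_inversion(sentence):
    # True when a questioning word is directly followed by a pronoun
    return inverted.search(sentence.lower()) is not None


examples = [
    "It is raining",                                              # False
    "Is it raining",                                              # True
    "It's raining heavily outside, I wanted to leave, should I",  # True
    "Neha is alone at home. Could you pick her up",               # True
]
for s in examples:
    print(has_inversion(s), "-", s)
```

Because the pattern is searched anywhere in the input, a question that follows an ordinary statement is still found, and swapping "it is" to "is it" is enough to change the result.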


In [3]:

print("predicted_question:",predictor("Hey Jimmmy...........How are you?"))


predicted_question: 1


In [4]:
print("predicted_question:",predictor("Are you fine"))

predicted_question: 1


In [5]:
print("predicted_question:",predictor("Marry is fine"))

predicted_question: 0


In [6]:
print("predicted_question:",predictor("The boy is playing with his toys"))

predicted_question: 0


In [7]:
print("predicted_question:",predictor("It's raining outside. I think I should leave."))

predicted_question: 0


In [8]:
print("predicted_question:",predictor("It's raining outside. I think I should leave. Should I"))

predicted_question: 1


In [9]:
print("predicted_question:",predictor("Hi!!! I'm Wysa. May I know your name"))

predicted_question: 1


In [10]:
print("predicted_question:",predictor("Hi!!! I'm Wysa."))

predicted_question: 0


In [11]:
print("predicted_question:",predictor("Neha is alone at home. Could you pick her up"))



predicted_question: 1


In [12]:
print("predicted_question:",predictor("Is it raining"))

predicted_question: 1


In [13]:
print("predicted_question:",predictor("It is raining"))

predicted_question: 0


In [14]:
print("predicted_question:",predictor("Your name is Rishabh.....Is it"))

predicted_question: 1


In [15]:
print("predicted_question:",predictor("Raju is at home, where is Hari"))



predicted_question: 1


In [16]:
print("predicted_question:",predictor("I'm not well,,,,,, could you suggest me some medicines."))



predicted_question: 1


In [17]:
print("predicted_question:",predictor("do you like food?"))

predicted_question: 1


In [18]:
print("predicted_question:",predictor("The boy who sat beside him was his son."))


predicted_question: 0


In [19]:
print("predicted_question:",predictor("My name is Rishabh..I'm looking for summer internship in the field of Machine Learning"))

predicted_question: 0


In [20]:
print("predicted_question:",predictor("Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it learn for themselves."))

predicted_question: 0
