# The Grammar Checker

By Ritesh, Sunny and Tahseen

## What is a Grammar Checker?

"In my free time, I like to go out in a hike or two. After that, I usually get some food because I am hungry from hiking. I like to visit waterfalls, they are quite peaceful to be around, as well as magnificent to look at them."

**QuillBot:**

![Quillbot Example](images/quillbot_sc.png)

**Writer:**

![Writer Example](images/writer_sc.png)

**GrammarCheck:**

![GrammarCheck Example](images/grammarcheck_sc.png)

**Grammarly:**

![Grammarly Example](images/grammarly_sc.png)

So, what is a grammar checker?

## The Theory Behind Grammar Checking

### Rule-Based Checking:

- A set of well-defined grammar rules designed by linguistic experts

- Text is tagged with parts of speech (POS) and checked against the rules, both to find errors and to find a rule that corrects each error.

**Pros:**

- Easy to add, edit and delete rules.

- Detailed error messages, which are helpful for computer-aided language learning.

**Cons:**

- Intense manual effort to create and validate these rules.

- Manual maintenance of hundreds of grammar rules is tedious and expensive.
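To make the rule-based idea concrete, here is a minimal sketch of a single hand-written rule applied to POS-tagged text. Everything in it is an assumption for illustration: the toy `LEXICON` stands in for a real tagger, and `RuleMatch` and `check_your_vs_youre` are hypothetical names, not part of any grammar-checking library.

```python
import re
from dataclasses import dataclass

# Toy part-of-speech lexicon (hypothetical); a real checker would use a trained tagger.
LEXICON = {"very": "RB", "really": "RB", "nice": "JJ", "kind": "JJ", "dog": "NN", "idea": "NN"}


@dataclass
class RuleMatch:
    offset: int
    error_length: int
    message: str
    replacement: str


def tokenize(text):
    """Return (word, start offset) pairs, skipping punctuation."""
    return [(m.group(0), m.start()) for m in re.finditer(r"[A-Za-z']+", text)]


def tag(word):
    """Look up a POS tag; unknown words get the placeholder tag 'UNK'."""
    return LEXICON.get(word.lower(), "UNK")


def check_your_vs_youre(text):
    """Rule: 'your' + adverb + adjective, with no noun after it, should be "you're"."""
    tokens = tokenize(text)
    matches = []
    for i in range(len(tokens) - 2):
        word, start = tokens[i]
        if word.lower() != "your":
            continue
        if tag(tokens[i + 1][0]) != "RB" or tag(tokens[i + 2][0]) != "JJ":
            continue
        if i + 3 < len(tokens) and tag(tokens[i + 3][0]) == "NN":
            continue  # "your very nice dog" is a valid noun phrase
        fix = "You're" if word[0].isupper() else "you're"
        matches.append(RuleMatch(start, len(word), "Did you mean you're (short for you are)?", fix))
    return matches


print(check_your_vs_youre("Your very nice."))
print(check_your_vs_youre("Your very nice dog is kind."))
```

The rule produces both the error location and the correction, which is why rule-based messages can be so detailed. It also shows the cost: even this one rule needs a noun exception, and every further exception has to be written by hand.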

### Machine Learning Technique:

- Most popular technique for grammar checking at the moment.

- Supervised learning models are trained on an annotated corpus (a large body of text).

- The trained model performs statistical analysis to detect and correct grammatical errors.

**Pros:**

- Does not require extensive domain knowledge of the grammar (can be applied to any language).

- Can identify more complex issues in writing (tone, word choice, clarity, etc.).

**Cons:**

- Requires a large annotated corpus; without one, the model is unusable for grammar checking.

- The model's performance depends greatly on how 'clean' the corpus is.
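As a rough illustration of the statistical approach, the sketch below learns from a tiny annotated corpus which of two confusable words usually precedes a given word. The corpus, the `train` and `predict` helpers, and the single-word context are all assumptions chosen to keep the example small; production systems use far larger corpora and richer models.

```python
from collections import Counter, defaultdict

# Hypothetical annotated corpus: (word that follows, correct form before it).
ANNOTATED = [
    ("very", "you're"), ("welcome", "you're"), ("right", "you're"),
    ("dog", "your"), ("book", "your"), ("very", "you're"), ("house", "your"),
]


def train(examples):
    """Count how often each correct form appears before each following word."""
    counts = defaultdict(Counter)
    for next_word, label in examples:
        counts[next_word][label] += 1
    return counts


def predict(counts, next_word, written):
    """Return the most frequent form for this context, or keep what was written."""
    seen = counts.get(next_word)
    if not seen:
        return written  # no evidence in the corpus: do not guess
    return seen.most_common(1)[0][0]


model = train(ANNOTATED)
print(predict(model, "very", "your"))    # context seen in the corpus
print(predict(model, "garden", "your"))  # unseen context, left unchanged
```

No grammar knowledge is encoded here, only counts. That is why the technique transfers across languages, and also why a small or noisy corpus directly degrades the predictions.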

### Hybrid Learning Technique:

- Combines rule-based and ML techniques to address a wide range of complex errors.

- Rule-based checking is typically better at catching simpler errors, while machine learning can identify more complex errors (e.g., determiner errors).

- Reduces the workload of writing rules by hand.
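A hybrid checker can be sketched as two passes whose results are merged. In the example below, `rule_checker` is a hand-written repeated-word rule and `model_checker` is only a stand-in for a trained model: its `LEARNED` table is hard-coded so the sketch stays runnable. All names and the match format are assumptions for illustration.

```python
import re

# Stand-in for what a trained model might have learned about determiners (hypothetical).
LEARNED = {("a", "apple"): "an", ("an", "book"): "a"}


def rule_checker(text):
    """Hand-written rule: flag a doubled word such as 'the the'."""
    return [
        {"offset": m.start(), "errorLength": len(m.group(0)),
         "message": "Repeated word", "source": "rule"}
        for m in re.finditer(r"\b(\w+) \1\b", text, flags=re.IGNORECASE)
    ]


def model_checker(text):
    """Flag determiners that the stand-in model would replace."""
    tokens = [(m.group(0), m.start()) for m in re.finditer(r"\w+", text)]
    results = []
    for (word, start), (next_word, _) in zip(tokens, tokens[1:]):
        fix = LEARNED.get((word.lower(), next_word.lower()))
        if fix:
            results.append({"offset": start, "errorLength": len(word),
                            "message": f"Did you mean '{fix}'?", "source": "model"})
    return results


def hybrid_check(text):
    """Run the rules first, then add model matches at offsets the rules did not flag."""
    matches = rule_checker(text)
    covered = {m["offset"] for m in matches}
    matches += [m for m in model_checker(text) if m["offset"] not in covered]
    return sorted(matches, key=lambda m: m["offset"])


for match in hybrid_check("I ate a apple in the the park."):
    print(match)
```

Giving the rules priority at a shared offset is one possible design choice; a real system might instead rank matches by confidence.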

## Implementation

How Jupyter Notebook works:

![Notebook Components (source [1])](images/notebook_components.png)

### Jupyter Extensions:

![Screenshot of Jupyter Extensions Tab](images/jupyter-extensions-example.png)

### Jupyter Extension Requirements:

![Extension Requirements](images/jupyter-requirements.png)

### Libraries Used

<div style="text-align: center" align="center"><img style="margin:auto" src="images/lt_logo.png" /></div>

![Re-Mark](images/remark_logo.svg)

### Markdown

```
Your very <b>nice</b>.
```

### AnnotatedText 

```
{"annotation":[{"interpretAs":"","markup":"","offset":{"end":0,"start":0}},{"offset":{"end":10,"start":0},"text":"Your very "},{"interpretAs":"","markup":"<b>","offset":{"end":13,"start":10}},{"offset":{"end":17,"start":13},"text":"nice"},{"interpretAs":"","markup":"</b>","offset":{"end":21,"start":17}},{"offset":{"end":22,"start":21},"text":"."}]}
```
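The structure above separates markup from checkable text while keeping source offsets. The sketch below approximates that split in Python for inline HTML tags only. It is not the `annotatedtext-remark` implementation: the `annotate` function is hypothetical, it does not parse Markdown syntax, and it omits the empty leading markup node shown in the example.

```python
import json
import re

TAG = re.compile(r"<[^>]+>")  # matches inline HTML tags such as <b> and </b>


def annotate(source):
    """Split source into text and markup nodes, recording each node's offsets."""
    annotation = []
    pos = 0
    for m in TAG.finditer(source):
        if m.start() > pos:
            # Plain text between the previous tag (or the start) and this tag.
            annotation.append({"text": source[pos:m.start()],
                               "offset": {"start": pos, "end": m.start()}})
        # The tag itself is markup the grammar checker should ignore.
        annotation.append({"markup": m.group(0), "interpretAs": "",
                           "offset": {"start": m.start(), "end": m.end()}})
        pos = m.end()
    if pos < len(source):
        annotation.append({"text": source[pos:],
                           "offset": {"start": pos, "end": len(source)}})
    return {"annotation": annotation}


print(json.dumps(annotate("Your very <b>nice</b>."), indent=2))
```

Because every node carries its position in the original source, a match reported against the text can be mapped back to the right place in the notebook cell.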

### Grammar check response

```
[{"offset": 0, "errorLength": 4, "message": "Did you mean you're (short for you are)?"}]
```
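A response like this can be mapped back onto the source text by slicing with `offset` and `errorLength`. The sketch below assumes the offsets index into the original source string, markup included; `flagged_spans` is a hypothetical helper, not part of the extension.

```python
def flagged_spans(source, matches):
    """Pair each match with the slice of the source text it points at."""
    spans = []
    for match in matches:
        start = match["offset"]
        end = start + match["errorLength"]
        spans.append((source[start:end], match["message"]))
    return spans


source = "Your very <b>nice</b>."
response = [{"offset": 0, "errorLength": 4,
             "message": "Did you mean you're (short for you are)?"}]

for flagged, message in flagged_spans(source, response):
    print(f"{flagged!r}: {message}")
```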

### Extension Architecture

![Diagram of Implementation](images/system_design.png)

## Demonstration


**Disclaimer!**

The LanguageTool API is not perfect! It (unfortunately) misses some grammatical errors that other tools (such as Grammarly) catch.

## Statistics

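**Errors per Chapter:**

| Chapter | Spelling errors | Grammar errors |
|---------|-----------------|----------------|
| 0 | 32 | 21 |
| 1 | 54 | 89 |
| 2 | 14 | 58 |
| 3 | 27 | 85 |
| 4 | 12 | 11 |
| 5 | 4 | 30 |
| 6 | 12 | 103 |
| 7 | 6 | 84 |
| 8 | 1 | 13 |
| 9 | 9 | 64 |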
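For a quick summary, the per-chapter figures above can be totalled with a few lines of Python. The variable names are arbitrary, and the numbers are copied directly from the figures above.

```python
# (spelling errors, grammar errors) for chapters 0 through 9.
ERRORS = [
    (32, 21), (54, 89), (14, 58), (27, 85), (12, 11),
    (4, 30), (12, 103), (6, 84), (1, 13), (9, 64),
]

spelling_total = sum(spelling for spelling, _ in ERRORS)
grammar_total = sum(grammar for _, grammar in ERRORS)
# Chapter index with the most errors of both kinds combined.
worst_chapter = max(range(len(ERRORS)), key=lambda chapter: sum(ERRORS[chapter]))

print(f"Spelling errors: {spelling_total}")   # 171
print(f"Grammar errors: {grammar_total}")     # 558
print(f"Most errors in chapter {worst_chapter}")  # chapter 1
```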

## Thank You for Watching! :)

## Bibliography

1. Soni, M., & Thakur, J. S. (2018). A systematic review of automated grammar checking in English language. arXiv preprint arXiv:1804.00540.

## Image Sources / Repos:

1. https://docs.jupyter.org/en/latest/projects/architecture/content-architecture.html

2. https://github.com/remarkjs/remark

3. https://www.npmjs.com/package/annotatedtext-remark

## Grammar Checkers

1. https://quillbot.com/
2. https://writer.com/grammar-checker/
3. https://www.grammarcheck.net/editor/
4. https://www.grammarly.com/
5. https://languagetool.org/