# AI Coding Assistants

![](../graphics/artwork/copilot.png)

2022: AI Coding Assistants are entering the market

- [*GitHub Copilot*](https://github.com/features/copilot/)
- [*tabnine*](https://www.tabnine.com)
- [*codiga*](https://www.codiga.io)
- ...

What technology is behind them? How do they perform? And how will they impact the work of developers?

## Sandbox 

[**GitHub Copilot Sandbox**](ai-copilot-sandbox.ipynb)

## Discussion: Will AI change the way we code (for the better)?

_Do you have experience with AI coding assistants? What are your thoughts on the topic? Which advantages and disadvantages do you see?_

## What powers GitHub Copilot?

The machinery behind GitHub Copilot is [**OpenAI Codex**](https://openai.com/blog/openai-codex/), a large language model trained on source code. 

- OpenAI Codex is a descendant of [**GPT-3**](https://en.wikipedia.org/wiki/OpenAI#GPT-3), a large language model trained on natural language text from the web.
- The model is trained on 1.5 billion lines of code from GitHub
  - ... and **fine-tuned** on a smaller, curated set of high-quality code
- OpenAI Codex has much of the natural language understanding of GPT-3, but it produces working code


## Sequence Learning Models

- Training a _language model_ is a **sequence learning** task
  - **sequence learning**: any machine learning task where the input or the output (or both) is a sequence of values or tokens

- **vector-to-sequence**: 
  - e.g. _image captioning_: image -> sequence of words
  
- **sequence-to-vector**: 
  - e.g. _sentiment analysis_: sequence of words -> sentiment
  
- **sequence-to-sequence**:
  - e.g. _machine translation_: sequence of words in one language -> sequence of words in another language
  - e.g. _text summarization_: sequence of words -> sequence of words (shorter)
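
The three task shapes differ only in which side of the mapping has variable length. The sketch below illustrates the input and output types with hand-written toy functions; the function names and rules are hypothetical stand-ins, not learned models.

```python
from typing import List

Vector = List[float]    # fixed-size numeric input or output
Sequence = List[str]    # variable-length list of tokens


def caption_image(image: Vector) -> Sequence:
    """vector-to-sequence: fixed-size input, variable-length output."""
    brightness = sum(image) / len(image)
    return ["a", "bright", "image"] if brightness > 0.5 else ["a", "dark", "image"]


def sentiment(words: Sequence) -> Vector:
    """sequence-to-vector: variable-length input, fixed-size output."""
    positive = {"good", "great"}
    negative = {"bad", "awful"}
    score = sum((w in positive) - (w in negative) for w in words)
    return [float(score)]


def summarize(words: Sequence, max_len: int = 3) -> Sequence:
    """sequence-to-sequence: variable-length input and output (toy: truncate)."""
    return words[:max_len]
```

A real model would replace each hand-written rule with learned parameters, but the signatures stay the same.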


## Evolution of Sequence Learning

RNN -> LSTM -> Transformer

### Recurrent Neural Network (RNN)

Unlike a **feedforward neural network**, a **recurrent neural network** feeds its internal state from one step back in as input to the next, so it can process sequences of inputs. That gives the network _sequential memory_.

![](https://www.researchgate.net/profile/Dana-Hughes/publication/305881131/figure/fig5/AS:391681317851147@1470395511494/Feed-forward-and-recurrent-neural-networks.png)
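
The recurrence can be sketched with a single-unit RNN in plain Python. The weights `w_x`, `w_h` and `b` are arbitrary values chosen for illustration; in a real network they are learned and are matrices rather than scalars.

```python
import math


def rnn_step(x, h, w_x, w_h, b):
    """One step: the new state depends on the current input x and the old state h."""
    return math.tanh(w_x * x + w_h * h + b)


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence left to right, carrying the hidden state along."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


states_a = rnn_forward([1.0, 0.0, 0.0])  # a signal at the first step only
states_b = rnn_forward([0.0, 0.0, 0.0])  # no signal at all
```

In `states_a` the final state is still non-zero although the last two inputs are zero: the first input is remembered through the hidden state, but its trace gets weaker with every step. In `states_b` the state stays at zero.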

pros:
- can process sequences of inputs

cons:
- slow to train (sequential processing of sequence)
- bad at learning from long sequences - "short memory" 
  - due to _vanishing gradient problem_
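
To see why the gradient vanishes, consider how strongly the final state of a single-unit tanh RNN depends on its initial state. By the chain rule this is a product of one factor per time step, and each factor is smaller than 1 whenever the recurrent weight is smaller than 1. The sketch below assumes scalar weights picked for illustration.

```python
import math


def gradient_through_time(steps, w_h=0.8, w_x=0.5, x=0.0, h0=0.5):
    """Derivative of the final hidden state with respect to the initial state h0."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(w_x * x + w_h * h)
        # d h_t / d h_{t-1} = w_h * (1 - tanh(...)^2), always below w_h here
        grad *= w_h * (1.0 - h * h)
    return grad


short = gradient_through_time(5)
long = gradient_through_time(50)
```

The gradient shrinks roughly exponentially with the number of steps, so early inputs in a long sequence contribute almost nothing to the weight updates.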



### LSTM Networks

- **Long Short-Term Memory** is a recurrent neural network architecture. 
- LSTM networks contain **LSTM units**, in which a **memory cell** can store values and **gates** regulate read, write and delete operations on the cell. 
- During training, the LSTM network can learn how to operate memory in order to remember the information that is relevant for the task. 
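
A minimal sketch of one LSTM step for a single unit, using the standard gate equations. The parameter dictionary `p` and its hand-set values are hypothetical; in a trained network they are learned, and they are matrices rather than scalars.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, p):
    """One step of a single-unit LSTM. p maps a gate name to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h + b

    f = sigmoid(pre("forget"))        # how much of the old cell value to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell to read out
    g = math.tanh(pre("candidate"))   # candidate value to write
    c_new = f * c + i * g             # update the memory cell
    h_new = o * math.tanh(c_new)      # expose part of the cell as hidden state
    return h_new, c_new


# Hand-set gates: keep the cell (forget gate open), write nothing (input gate closed).
keep = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0
for _ in range(100):
    h, c = lstm_step(0.0, h, c, keep)
```

After 100 steps the cell value `c` is still close to its initial value of 1.0. Because the cell is updated additively and scaled only by the forget gate, information (and gradient) can survive many steps when the gates allow it.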



pros:
- can learn from _longer_ sequences of inputs
  - by mitigating the _vanishing gradient problem_
  
cons:
- slower to train (sequential processing of sequence, more complex than simple RNN)
- not truly bidirectional, although in language the context both to the left and to the right of a word matters


### Transformer

- a **transformer** is a **sequence model** that is not a recurrent neural network; instead it relies on an **attention mechanism**, which lets every position in the sequence look directly at every other position

pros:
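- can learn from _even longer_ sequences
  - in theory unbounded; in practice limited by compute and memory, since the cost of attention grows quadratically with sequence length
- fast to train (parallel processing of sequence)
- can use context on both sides of a token (in encoder models such as BERT; GPT-style decoders attend only to the left)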
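
The core of the attention mechanism is scaled dot-product attention: each position computes a weighted average over all positions, with weights derived from query-key similarity. The sketch below is a simplified single-head version in plain Python; it omits the learned projection matrices, multiple heads and masking that a real transformer uses.

```python
import math


def softmax(scores):
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:                    # each position is independent of the others
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys])
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Self-attention: the same sequence serves as queries, keys and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = attention(x, x, x)
```

The loop over queries has no dependency between iterations, which is why a whole sequence can be processed in parallel, unlike the step-by-step recurrence of an RNN.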

## Introducing Large Language Models (e.g. GPT)

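GPT (Generative Pre-trained Transformer) is a language model. It is based on the **Transformer** architecture. GPT models are trained on a large corpus of text; Codex, the descendant that powers GitHub Copilot, was additionally trained on source code and can be used to generate code snippets.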

**What is a language model?**

Language models have a large variety of use cases:
- **text generation**
- **text classification**
- **text summarization**
- **question answering**
- **machine translation**
- **speech recognition**


A language model has learned a **probability distribution** over sequences of words.
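
As a toy illustration of such a probability distribution, the sketch below estimates a bigram model by counting which word follows which in a tiny corpus, then uses it to score and to generate sequences. This is a deliberately simple stand-in: GPT conditions on the whole preceding context with a transformer, not just on the previous word.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Estimate P(next word | previous word) from counts."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def sequence_probability(model, tokens):
    """Probability of the sequence given its first token (chain rule)."""
    p = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p *= model.get(prev, {}).get(nxt, 0.0)
    return p


def generate(model, start, max_tokens=5):
    """Autoregressive generation: repeatedly append the most likely next word."""
    out = [start]
    while len(out) < max_tokens and out[-1] in model:
        dist = model[out[-1]]
        out.append(max(dist, key=dist.get))
    return out


corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
completion = generate(model, "on", max_tokens=3)
```

Here `model["the"]` assigns probability 2/3 to `cat` and 1/3 to `mat`. Generation feeds each predicted word back in as context for the next prediction, which is the same autoregressive loop a GPT-style model runs at a much larger scale.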


![](https://jalammar.github.io/images/xlnet/gpt-2-autoregression-2.gif)
_Source: [The Illustrated GPT-2 (Visualizing Transformer Language Models)](https://jalammar.github.io/illustrated-gpt2/)_



![](https://miro.medium.com/max/582/1*C-KNWQC_wXh-Q2wc6VPK1g.png)
<br>
_Source: [GPT-3: The New Mighty Language Model from OpenAI](https://towardsdatascience.com/gpt-3-the-new-mighty-language-model-from-openai-a74ff35346fc)_

## OpenAI Codex

todo?


## Will AI change the way we code (for the better)? - the evidence so far

- Does GitHub Copilot lead to higher productivity?
  - GitHub's own user study says yes
  - independent studies are so far inconclusive
- Does GitHub Copilot lead to better code?
  - Copilot may reproduce security flaws: [_Do Users Write More Insecure Code with AI Assistants?_](https://arxiv.org/pdf/2211.03622.pdf)
   
    > "We observed that participants who had access to the AI assistant were more likely to introduce security vulnerabilities for the majority of programming tasks, yet also more likely to rate their insecure answers as secure compared to those in our control group."

    > "Additionally, we found that participants who invested more in the creation of their queries to the AI assistant, such as providing helper functions or adjusting the parameters, were more likely to eventually provide secure solutions."

### References

- [The Illustrated GPT-2 (Visualizing Transformer Language Models)](https://jalammar.github.io/illustrated-gpt2/)
- [Transformer Neural Networks - EXPLAINED! (Attention is all you need)](https://www.youtube.com/watch?v=TQQlZhbC5ps)

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2022 [Christian Staudt](https://clstaudt.me)_