# Introduction to Generative AI

Generative AI can be used to create a variety of content, including code. Generative AI tools are built on computational models that have been trained on large amounts of data. When provided with a prompt, they generate new content that is similar to the data they were trained on. Different generative AI models have been trained on different datasets and are able to produce different types of output. Some common types of generative AI models include:

* **Text Generation**: These models can generate text based on a prompt. They can be used to generate stories, poems, and even code. These tools use Large Language Models (LLMs) like GPT-3, which have been trained on a diverse range of text data.
* **Image Generation**: These models can generate images based on a prompt. They can be used to create art, design, and even realistic photos. Some popular models include DALL-E and BigGAN. These models may be based on Generative Adversarial Networks (GANs) or on diffusion models.
* **Music Generation**: These models can generate music based on a prompt. They can be used to create new songs, soundtracks, and even entire albums. Some popular models include OpenAI's MuseNet and Jukebox. These models may be based on Recurrent Neural Networks (RNNs) or Transformers.
* **Video Generation**: These models can generate videos based on a prompt. They can be used to create animations, movies, and even deepfakes. Some popular models include Deepfake and Face2Face. These models may be based on GANs or other deep learning architectures. Video generation is not currently as mature as the other applications.

Understanding and creating these models is a complex process that requires a deep understanding of machine learning, neural networks, and natural language processing. In the last few years, these technologies have matured to a point where the content they generate is useful. Combined with the creation of user-friendly interfaces, these models are now accessible to a wider audience.

The most important things to know about how these models work are:

* They produce things like what they were trained on. 
* They do not think or reason like humans.
* They do not understand the content they generate.

This leads to a number of different issues and considerations when using these tools.

## Limitations and Problems of Generative AI

While generative AI models have many applications, they also have limitations.

### Correctness

Generative AI models can produce incorrect or misleading content. This can be due to errors in the model, biases or incorrect information in the training data, or the limitations of the model architecture. This makes it vital to check the output of these models and not take it at face value. For example, I asked Copilot (which is powered by OpenAI's GPT-4) to solve a simple quadratic equation:
<center>
<img src="resources/quadratic_example.png" alt="Copilot failing to solve the quadratic equation" width="50%">
</center>



It confidently provided an incorrect answer that appeared plausible at first glance. It provided the answers of $\frac{1}{2}$ and $-\frac{5}{4}$ when the correct answers were 0.804 and -1.55. The model does not understand the maths behind the problem and is simply generating text that looks like text one would see in a solution to a quadratic equation.

When a generative AI model produces false information and presents it as fact, this is known as "hallucination".
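One way to check an answer like this is to substitute the claimed roots back into the equation and compare them with the quadratic formula. The sketch below is illustrative only: the equation in the screenshot is not reproduced in the text, so the coefficients $4x^2 + 3x - 5 = 0$ are an assumption chosen because they are consistent with the correct answers quoted above. The function names `solve_quadratic` and `is_root` are hypothetical helpers, not part of any library.

```python
import math


def solve_quadratic(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 (assumes a != 0)."""
    discriminant = b**2 - 4 * a * c
    if discriminant < 0:
        return []  # no real roots
    sqrt_disc = math.sqrt(discriminant)
    return sorted([(-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a)])


def is_root(a, b, c, x, tol=1e-6):
    """Check a claimed root by substituting it back into the equation."""
    return abs(a * x**2 + b * x + c) < tol


# Assumed coefficients (not taken from the screenshot): 4x^2 + 3x - 5 = 0
a, b, c = 4, 3, -5

# The answers Copilot gave
claimed = [1 / 2, -5 / 4]
for x in claimed:
    print(f"x = {x}: is a root? {is_root(a, b, c, x)}")

# Roots computed directly from the quadratic formula, rounded for display
print("Quadratic formula:", [round(r, 3) for r in solve_quadratic(a, b, c)])
```

Under these assumed coefficients, neither claimed value satisfies the equation, while the formula gives roots that round to the correct answers quoted above. A check like this takes only a few lines and does not rely on the model's own explanation.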

### Bias

Generative AI models can also produce biased content. This is commonly due to biases in the training data, which can be reflected in the output of the model. For example, I asked Copilot for a list of the top 10 famous scientists and it provided the following list:

* Albert Einstein
* Isaac Newton
* Nikola Tesla
* Marie Curie
* Niels Bohr
* James Clerk Maxwell
* Charles Darwin
* Galileo Galilei
* Rosalind Franklin

This list consists entirely of Western scientists, with no representation from other parts of the world. This is an example of bias in the output of a generative AI model, and it probably occurs because the internet has more information about Western scientists than about scientists from other parts of the world.

### Easy to Confuse

Generative AI can be misled by incorrect or ambiguous prompts. For example, I asked Copilot what the westernmost point of Europe was, and contradicted its answer:

<center>
<img src="resources/western_europe_example.png" alt="Arguing with AI over the westernmost point of Europe" width="50%">
</center>


Copilot did not stand by its original, correct answer and switched to a different one when I contradicted it. This means that if you push an AI in a particular direction, deliberately or accidentally, it may provide incorrect information.

### Ethics Considerations

Generative AI models can also raise ethical concerns:

* Generative AI may be misused to generate harmful content, such as fake news, deepfakes, or hate speech. 
* Generative AI can be used to automate tasks that may have negative consequences, such as spamming, hacking, or surveillance.
* Generative AI may be used to infringe on intellectual property rights, such as copyright or trademarks. Models may have been trained on copyrighted data, or may generate content that infringes on existing intellectual property. This can be viewed as a form of stealing and can damage the livelihoods of creators.
* Training Generative AI models requires large amounts of computational resources, which can have a significant carbon footprint. This can contribute to climate change and other environmental issues.
* Generative AI models make it easier to create a dishonest representation of a person's abilities or knowledge. For example, a job applicant could use a generative AI model to generate a fake portfolio of work, or answers to interview questions.
* Incorporating content produced by generative AI models into research publications risks introducing hallucinations and mistakes into the scientific record. This could hamper the progress of science in the long term.

### Academic Integrity

The question of how to integrate generative AI into academic work is not simple, and is one educators are grappling with. Generative AI is a useful tool that future employees will need to be familiar with in many fields. It can also be very helpful to students learning new concepts through the generation of examples and explanations.

However, it is also tempting for students to use it to complete assignments or exams, as a replacement for building their own understanding and base of knowledge in a topic. This can lead to a lack of understanding of the material and a lack of critical thinking skills.

Different institutions have taken different approaches to this issue. Some have banned the use of generative AI tools in academic work, while others have embraced them as a tool for learning. At Imperial, you may include content generated by generative AI in your assessed work, but you must clearly cite it, just as you would any other source. This means using generative AI to write entire essays, reports or code assignments is not allowed. It is also impractical to use generative AI to edit your work, or to regularly generate small parts of it, as you would need to cite the AI every time.

In your academic work, you should use generative AI in a responsible way that is consistent with Imperial's rules on academic integrity, and in a way that helps rather than impedes your ability to learn and perform research.

If you want to know more about Imperial's rules and guidance on generative AI, there is general guidance [here](https://www.imperial.ac.uk/students/academic-support/ai-and-study-guidance-hub/) and advice from the library on how to reference your use of generative AI tools [here](https://www.imperial.ac.uk/admin-services/library/learning-support/generative-ai-guidance/).

## Exercise

Copilot is a powerful generative AI tool, based on OpenAI's GPT-4 model. Imperial has an institutional license for Copilot, giving you access to it. Follow [this link](https://copilot.microsoft.com/) and log in with your Imperial account. Check that there is an Imperial logo in the top-left of the screen and a green logo saying "Protected" in the top-right. This means you are using the Imperial license and your data isn't stored by Microsoft.

Spend a few minutes asking Copilot questions and seeing what it can do. Specifically:

* Ask it questions you know the answer to, to see if it gets them right.
* Try to confuse it by asking it questions with ambiguous or incorrect information.
* Ask it a question you think might generate a biased answer.
* Ask it to generate some code for you, and see if it is useful.