# Large Language Models

The following are the objectives of this notebook:

1. Present a design for an application that uses a Large Language Model (LLM).
2. Justify the use of this application against a target audience.
3. Critically reflect on the social and cultural implications of the application.

---

## Part 1 - What Are LLMs?

![LLM](LLM.jpg)
[DICS, Large Language Models](https://dics.co/current-affairs/large-language-models-upsc)

*The information in this part is drawn from the following resource:*

Stryker, C. 2025. "What are large language models (LLMs)?" IBM. https://www.ibm.com/think/topics/large-language-models

LLMs are a category of deep learning models trained on large amounts of data, which enables them to generate natural language and other content and to perform a variety of tasks, such as summarizing an article or debugging code.

An example of an LLM-based tool is Microsoft Copilot, which I have been evaluating throughout this repository. The models behind Copilot are trained on a vast amount of data from across the internet so that they can give an informed response to almost any question a user might pose. Although other notebooks have questioned the reliability of this tool, LLMs in general mainly serve to help humans complete tasks more efficiently.

For this proposal, an LLM must be designed that can augment learning, enhance cultural engagement and/or improve the inclusion of underrepresented groups, showcasing the interaction between humans and AI.

With this knowledge of LLMs and the project description in mind, I reasoned that the design must address the following:

1. The purpose of the model.
2. The data the model will be trained on.
3. How it will use the data to meet that purpose.
4. Who will use the model, and how it will assist them.

---

## Part 2 - The Application

### Description

My application will serve as a plug-in that scans the user's screen on request and predicts whether the content the user asks about, such as an image or a piece of text, is AI generated.

In more detail, the user downloads the application and, while it is running, can call up a search tool and type a description of what on their screen they want scanned. The following is an example of what the user's screen may look like and what they might type to the model:

[BBC NEWS, New H3N2 flu strain is circulating - so should you buy a vaccine this year?, 2025](https://www.bbc.co.uk/news/articles/crk7j8nxlr6o)

![Application-Example](ApplicationExample.png)

*"Scan the center image of my screen and check whether it is AI generated."*

The application will process the request and locate the described content on the screen. It will then predict, based on its training, whether the image is AI generated and output something along the lines of:

*"The center image of your screen is unlikely to be AI generated."*

If the image is AI generated, the output would also include a description of the deductions that led to that conclusion.
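
The response step described above can be sketched in a few lines of Python. This is only an illustration: the `ScanResult` class, the `describe_result` function, the `0.5` threshold and the example reasons are all hypothetical names and values chosen for this sketch, not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanResult:
    """Hypothetical container for one prediction made by the detector."""
    target: str                # what the user asked to be scanned
    ai_probability: float      # detector's estimate, between 0 and 1
    reasons: List[str] = field(default_factory=list)  # deductions behind the estimate


def describe_result(result: ScanResult, threshold: float = 0.5) -> str:
    """Turn a prediction into the kind of message shown to the user."""
    if result.ai_probability >= threshold:
        message = f"The {result.target} is likely to be AI generated."
        # Only flagged content is accompanied by the deductions that were made.
        if result.reasons:
            message += " Reasons: " + "; ".join(result.reasons) + "."
        return message
    return f"The {result.target} is unlikely to be AI generated."


print(describe_result(ScanResult("center image of your screen", 0.2)))
```

Wording the output as "likely" or "unlikely" rather than as a definite verdict reflects that the model is only making a prediction.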

### LLM Training

In order to achieve this functionality the model must be trained and tested with a large amount of data. This will include:

1. AI Generated Content - This lets the model learn to spot patterns that recur in AI generated content. For AI generated art, this could include colours, art style, shading etc. that appear regularly. For textual content, it could be common syntax, grammar, vocabulary etc.
2. Human Generated Content - This serves the reverse purpose, so that the model has a thorough understanding of both types of content.
3. Human Interactions - The model needs to understand how humans communicate so that it can effectively comprehend and respond to questions asked by users.
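
Before a model can learn from textual patterns such as syntax and vocabulary, the text has to be turned into numbers. The sketch below shows one possible way to do this using only the Python standard library. The three features and the `text_features` function are assumptions made for illustration; a real detector would likely need many more features, and these ones are not claimed to separate AI text from human text on their own.

```python
import re
from typing import Dict


def text_features(text: str) -> Dict[str, float]:
    """Compute a few simple, illustrative measurements of a piece of text."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        # Nothing to measure, so return zeros rather than dividing by zero.
        return {
            "avg_sentence_length": 0.0,
            "vocabulary_richness": 0.0,
            "avg_word_length": 0.0,
        }
    return {
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / len(sentences),
        # Share of words that are distinct (1.0 means no word is repeated).
        "vocabulary_richness": len(set(words)) / len(words),
        # Average number of characters per word.
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }


print(text_features("The cat sat. The cat ran!"))
```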

The model that makes the prediction might take the form of a Random Forest. This is an ensemble of decision trees, each of which predicts an outcome through true/false style probing of the input, with the final prediction taken as the majority vote of the trees. An example decision tree model is below:

![DecisionTree](DecisionTree.jpg)
[DEVOPEDIA](https://devopedia.org/decision-trees-for-machine-learning)
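
To make the idea of many trees voting more concrete, here is a heavily simplified sketch in plain Python. It is a stand-in rather than a full Random Forest: each "tree" is only a one-question decision stump, trained on a random resample of the data and a randomly chosen feature. The function names, the feature values and the labels (1 for AI generated, 0 for human generated) are all made up for illustration.

```python
import random
from collections import Counter
from typing import List, Tuple

Sample = Tuple[List[float], int]        # (feature values, label): 1 = AI, 0 = human
Stump = Tuple[int, float, int, int]     # (feature index, threshold, label below, label above)


def train_stump(data: List[Sample], feature: int) -> Stump:
    """Find the true/false question on one feature that makes the fewest mistakes."""
    best = None
    for threshold in sorted({x[feature] for x, _ in data}):
        for below, above in ((0, 1), (1, 0)):
            errors = sum(
                (below if x[feature] <= threshold else above) != y for x, y in data
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    _, threshold, below, above = best
    return feature, threshold, below, above


def train_forest(data: List[Sample], n_trees: int = 25, seed: int = 0) -> List[Stump]:
    """Train each stump on a bootstrap resample and a randomly chosen feature."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(data) for _ in data]   # sample with replacement
        forest.append(train_stump(bootstrap, rng.randrange(n_features)))
    return forest


def predict(forest: List[Stump], x: List[float]) -> int:
    """Let every stump vote and return the majority label."""
    votes = Counter(
        below if x[feature] <= threshold else above
        for feature, threshold, below, above in forest
    )
    return votes.most_common(1)[0][0]


# Made-up training data: two numeric features per item, not real measurements.
training_data = [
    ([0.10, 0.20], 0), ([0.20, 0.10], 0), ([0.15, 0.25], 0), ([0.30, 0.20], 0),
    ([0.80, 0.90], 1), ([0.90, 0.80], 1), ([0.70, 0.85], 1), ([0.85, 0.70], 1),
]
forest = train_forest(training_data)
print(predict(forest, [0.95, 0.95]))
```

A full Random Forest grows deeper trees and considers a random subset of features at every split, but the combination of resampling and majority voting shown here is the core of the approach.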

### Intentions

The audience for this sort of application is vast, and it has many uses. However, to focus on one particular audience, I have chosen academics, which covers both students and lecturers/teachers. In terms of how it can augment learning:

1. When conducting research, it can be used to check that the resources gathered are genuine material. It is easy to be convinced that something on the internet is real even when it is not, so a tool that can spot details the human eye might miss helps reduce the risk of false information being referenced.
2. Teachers can use the tool when marking assignments to check whether students' submissions contain AI generated content.
3. Users can learn from the tool's descriptions of the markers of AI content, so that in future they can recognise and avoid such content themselves.

Although this use may seem superfluous to some people, as the markers of AI content might seem obvious, a portion of the online community is less media literate and cannot spot it as easily. In particular, older generations of internet users can fall victim to believing this content. Therefore, this design not only augments learning but also supports the inclusion of older internet users by helping them stay safe from false information on the internet.

---

## Part 3 - Reflection

Despite its positive applications, a tool such as this has some cultural and societal implications.

Firstly, although it can be used to flag AI content, society may come to rely on this tool and stop trusting any content until it has been proven genuine. This means that trust between creators and consumers could deteriorate. Building on this, human creators could feel pressured to ensure their art does not resemble AI output, even if they had been working in that style beforehand, just so their work has no chance of being flagged. This may prevent artists and creators from reaching their full creative potential.

There are also some privacy concerns with a tool such as this. Even though it will only scan a user's screen on their request, it could be seen as intrusive by users who do not want their screen captured. In addition, there could be issues relating to Intellectual Property rights. In order to process content, the application would have to capture an image of it, which could infringe Intellectual Property rights that restrict the copying of content.

Overall, if this application is to move forward, some adjustments are likely needed to meet societal and cultural requirements. Society will not be fully comfortable with a screen-scanning tool, so restricting the tool to scanning only content that the user inserts could avoid this. In addition, steps must be taken to avoid stigmatising AI content, so that creators do not fear being labelled as AI through the application's flagging.