![AI Hand Puppet final 2.png](attachment:191889e3-6a67-46ff-9aac-02917dfb33f5.png)
# **1. Introduction**

Ethical systems can be complex and multifaceted, shaped by diverse factors like history, culture, and personal beliefs. It is a continuous process that involves reflection and evaluation of choices when confronted with moral dilemmas, both on an individual and societal level. As algorithms and AI technologies become increasingly pervasive in our lives, ethical considerations are crucial to ensure their responsible and beneficial integration. This essay aims to address different facets of building and implementing AI models. At each step of the way, a wide range of underlying ethical conundrums bubble to the surface, and an effort has been made to highlight solutions that, while imperfect, are attentive to our current needs.

*Artificial Intelligence* has been transformed into a truly malleable term, recklessly opted to indicate agents or applications involving perception, text analysis, natural language processing (NLP), logical reasoning, decision support systems, virtual assistants, data and predictive analytics, autonomous vehicles, and robotics. In the same ecosphere exists surveillance, biometrics, social media algorithms, visual image recognition, navigation, and more broadly data collection. Nearly, if not all, internet users are under the influence of these systems, who make up nearly 64.6% of total global population ([Ani Petrosyan, 2023](https://www.statista.com/statistics/617136/digital-population-worldwide/#statisticContainer)). Internet Algorithms and their counterparts never received the AI fanfare that these latest innovations in robust statistical models now do, as they have come to be seen as a vital part of the internet existence. Yet, the adoption of these newer AI services in large scale, explicitly announces a change in how we use these systems beyond the implicit knowledge that they exist and operate behind a network of closed doors. 

![GPT users FINAL 2.png](attachment:05f878dd-d3a7-4561-9e96-3e959bbeca11.png)

**Figure 1.** Total Global users affected by Algorithmic systems (Source: [97+ ChatGPT Statistics & User Numbers in June 2023](https://nerdynav.com/chatgpt-statistics/#:~:text=Applying%20linear%20regression%2C%20ChatGPT%20had,December%202022%20(266%20million)))

The urgent need to implement a robust ethical framework for AI on a global level is impossibly clear. At the risk of flattening AI’s impact to just two widely known players: OpenAI’s advanced language processing tools are being applied to fields like marketing, sales, and customer care, where-in the use cases include long form content creation, technical writing, document analysis, and extended conversations; elsewhere, Stability AI-backed research communities are developing breakthrough AI models that apply to imaging, audio, video, 3D content, biotech and scientific research. UNECSO has urged governments to invest in research and development of AI systems that align with an ethical framework, which includes four key principles: respecting human autonomy and dignity, preventing harm, ensuring fairness, and explicability. These principles aim to guide the development and use of AI systems in an ethical, transparent, and accountable way (UNESCO, 2023). 

**Table 1.** EU AI Act's risk based approach to regulation (Source: [Lilian Edwards, 2022](https://www.adalovelaceinstitute.org/resource/eu-ai-act-explainer/))

|Risk | Permission | Intended use |
|:--- |:--- |:--- |
|Unacceptable risk | Prohibited | Social scoring, facial recognition, manipulation |
|High risk | Conformity assessment | Recruitment, medical devices, immigration, law |
|Limited risk | Transparency obligations | Chat bots, deep fakes |
|Minimal or no risk | Code of conduct | Spam filters, video games |

No doubt a crucial step in AI regulation, some argue generic platitudes and incremental steps towards a more equitable future with AI just may not be enough. Meanwhile, the EU AI Act has taken a risk based approach to regulation. Of these four categories as shown in Table. 1, the Act is most concerned with Unacceptable and High risk AI, which are to be outright banned and extensively regulated respectively. The act on ‘limited-risk’ AI has rhetorical effect but is limited in practice and application. Policymakers must now acknowledge that the public have complex and nuanced views about AI. Uncertainity is just as prominent as rabid hype, and findings show that people are worried about the security of their personal data, replacement of professional human judgements, and the implications for accountability and transparency in decision-making ([Roshni Modhvadia, 2023](https://www.adalovelaceinstitute.org/report/public-attitudes-ai/#4-4-governance-and-explainability-12)). There is a tendency from the loudest voices to fret over AI’s future harms without recognizing that we have absolute control over who and how AI effects today’s population. "Actually implementing legally binding regulation would challenge existing business models and practices, as actual policy is not just an implementation of ethical theory, but subject to societal power structures—and the agents that do have the power will push against anything that restricts them" ([Müller et al., 2021](https://plato.stanford.edu/archives/sum2021/entries/ethics-ai/)). We are seeing this in effect, as entities like OpenAI speak out on the urgent need for global AI regulation while lobbying for significant elements of E.U.’s AI Act to be watered down in ways that would reduce the regulatory burden on the company ([Billy Perrigo, 2023](https://time.com/6288245/openai-eu-lobbying-ai-act/)).

# **2. Data Laundering**

Data forms the backbone of any suitably large AI model. LLMs are pre-trained on large textual datasets, which can run up to 10 trillion words in size. Some commonly used textual datasets are Common Crawl, The Pile, MassiveText, Wikipedia, and GitHub. The most common sources in these datasets are books, news articles, scientific papers, Wikipedia, and filtered web content. Diffusion and other image generation models too require significant image data. 

**2.1 What does internet copyright in the post automation world even look like?**

The calls for transparency in data collection from giants like IBM and google is welcome. So far, multiple key challenges plague LLMs and generative image models. The massive image-text caption datasets used to train Stable Diffusion, and Google’s Imagen came from LAION, a small nonprofit organization registered in Germany. Outsourcing the heavy lifting of data collection and model training to non-commercial entities has allowed corporations to avoid accountability and legal liability, although that has begun to change in the recent months. Getty Images are bringing a copyright claim in the UK against Stability AI ([Getty Images, 2023](https://newsroom.gettyimages.com/en/getty-images/getty-images-statement)). Similarly, Clarkson, a California-based law firm alleges OpenAI of having massively violated the copyrights and privacy of countless people ([Clarkson, 2023](https://www.washingtonpost.com/technology/2023/06/28/openai-chatgpt-lawsuit-class-action/)) while Authors Sue OpenAI claiming Mass Copyright Infringement of hundreds of thousands of novels. The outcome of these lawsuits will impact the way AI developers employ datasets, and making a decision to pay for a license from some or all of the owners of the underlying works will prove inescapable.

A not so far-fetched solution is to produce new data just for the purposes of training these large models. While this would be an enormous undertaking, and which one might argue wouldn’t make sense given the breadth of data required to make these models what they are, companies like Adobe already claim to have taken steps towards transparency in training data. According to Adobe, everything fed to its models are either out of copyright licensed for training, or the Adobe Stock library. Yet, being a contributor to adobe stock automatically means that the artworks are used to train Adobe firefly. The obvious solution is an opt-in, opt-out system: where your creation is opted in or out from training AI models. This is a considered approach, not particularly taxing on the creators and a gentler form of data laundry. For example, Stability AI via the “Have I been trained?” website has committed to artists’ requests to opt-out. But pausing training of new AI tools does nothing to redress the harms of already deployed models. And while there appears to be growing interest in frameworks to achieve “machine unlearning” to address issues such as privacy, fairness, and data quality, it remains to be seen how much incentive will there really be to enforce this in practice, as well as a general hindrance in developing unlearning algorithms that are applicable to modern deep neural networks ([Salvatore Mercuri et al., 2022](https://arxiv.org/abs/2209.00939))

Beyond just the images that fall under a larger umbrella, digital ownership is hard, a problem that has existed since the inception of internet. Copyright meant to protect artists can be abused and enforced for nefarious purposes as well as claim ownership in unexpected situations. The stronger a copyright is, the harder it becomes for these models to become more robust and flexible, while also stifling creativity of other artists who mean to reproduce art for recreational purposes. But to lay privacy concerns at the feet of dodgy copyright is misguided. Artists deserve to be compensated, especially if their styles are so blatantly reused and the models generate a considerable profit (although solutions like Glaze which disrupts style mimicry is a step in the right direction ([Glaze](https://glaze.cs.uchicago.edu/whatis.html))). And while we know where Adobe gets its dataset from, looming still is a compensation model for the creators of stock imagery which was used to train Adobe Firefly ([Chris Stokel-Walker, 2023](https://www.fastcompany.com/90906560/adobe-feels-so-confident-its-firefly-generative-ai-wont-breach-copyright-itll-cover-your-legal-bills)).

![glaze.png](attachment:656e1fbb-d7e2-456e-af03-dc74ea69bcde.png)

**Figure 2.** Glaze in Action (Source: [Introducing Glaze, a tool to protect human artists from style mimicry by generative AI models](https://www.youtube.com/watch?v=zryvJjb9EEY))

**2.2 Cold Comfort for Human Creators**

The impact on web by these generative models cannot be overlooked. Generative AI offers a plethora of ways to automate the content farm process and spin up more junk sites with less effort. One site flagged by NewsGuard produced more than 1,200 articles a day. While riddled with factual errors typical of generative AI, these sites threaten to choke programmatic advertising, as well as exacerbate the misinformation problem by leveraging clickbaits and endless scrolling ([Tate Ryan-Mosley, 2023](https://www.technologyreview.com/2023/06/26/1075504/junk-websites-filled-with-ai-generated-text-are-pulling-in-money-from-programmatic-ads/)).

But what happens when much of internet is flooded by artificially generated content? Where synthetic text and uncanny-albeit extremely convincing images are all that we can turn to for training future models? Model Collapse is a real looming threat: as models are exposed to more AI-generated data, it performs worse over time, producing more errors in the responses and the content it generates, and produces far less non-erroneous variety in its responses ([Ilia Shumailov et al., 2023](https://doi.org/10.48550/arXiv.2305.17493)).

This news is worrisome for current generative AI technology and the companies seeking to monetize it, and the burden of moderating these precarious models will fall on workers. There is, perhaps, a silver lining for human creators: in a future overwhelmed with generative tools and its content, they will be even more valuable than they are today — the authenticity of interactions between real humans will grow more precious, and so will their art — if as sources of pristine training data for AI, or more optimistically, the ever human zeal to stand out and carve for themselves niche spaces on the web where they can thrive.     


# **3. Sustainability** 

Owing to energy hungry transactions between memory and processors, the rising need for memory storage, and the fact that in 2018 our computers consumed roughly 1-2% of the global electricity supply, the current figure is projected to rise between 8-21%, further exacerbating the current energy crisis ([Nathi Magubane, 2023](https://penntoday.upenn.edu/news/hidden-costs-ai-impending-energy-and-resource-strain)). The AI Index team - considering the number of parameters in a model, the energy efficiency of data centers, and the type of power generation used to deliver electricity - concluded that a training run for even the most efficient of the four models emitted more carbon than the average U.S. resident uses in a year.

![fig_2.8.2_co2emissions_by_model-1.png](attachment:bcf3af2b-0888-44a7-b841-c0d6d6f4f30f.png)

**Figure 3.** CO2 equivalent emissions of selected ML models with relevant examples (Source: [Luccioni et al., 2022; Strubell et al., 2019; via the 2023 AI Index Report](https://aiindex.stanford.edu/report/))

The burden to slow down carbon emissions exponentially grows, and energy-intensive computational models are straining data centers way past the point of big tech making good on achieving ambitious sustainability goals. Afterall, each query to a chatbot like ChatGPT, Microsoft’s Bing or Anthropic’s Claude is routed to data centers, where supercomputers crunch the models and perform numerous high-speed calculations at the same time — first, interpreting the user’s prompt, then working to predict the most plausible response at a time (Will Oremus, 2023). The cost of these transactions will only grow as these models get better, as running a query on OpenAI’s new, lightweight GPT-3.5 Turbo model costs less than one tenth of its top-of-the-line GPT-4 ([Ankur A. Patel, 2023](https://www.ankursnewsletter.com/p/gpt-4-gpt-3-and-gpt-35-turbo-a-review)). There is a real risk where the capabilities of the best AI models are available to only a select few who can afford it, while the rest have to make peace with subpar models.

# **4. Human Labour**

Reinforcement learning with human feedback (RLHF) is used to align models like ChatGPT to achieve human preferable results. Behind most, if not all, AI systems are people — workers in large swathes labeling or annotating data to train it. This work, tedious, cryptic, and inexplicable for those involved, goes largely uncredited. Similar to social media giants like Facebook, AI is fed with labeled examples of violence, hate speech, and sexual abuse to detect toxicity it encounters randomly or when actively fed. Collecting sexual and violent images—some of them illegal under U.S. law—seems to be a necessary step in making these extremely large AI models safer, and this work is outsourced to firms in countries like Kenya ([Billy Perrigo, 2023](https://time.com/6247678/openai-chatgpt-kenya-workers/)). Data labelling companies have propped up in countries with cheap labour like India, Malaysia, and the Philippines, employing thousands of overwhelmingly young people trained in English and basic computer skills earning them anything ranging from a few pennies to reasonable wages. Unemployment is driving graduates as well to do labelling and annotation tasks just to get by. 

These workers also intervene when the data reeks of ambiguity, forcing themselves to categorize when the machine struggles to comprehend the extent of real world diversity. Where a human would get the concept of “shirt” with a few examples, machine-learning programs need thousands, and they need to be categorized with perfect consistency but with just enough variation to avoid edge cases ([Josh Dzieza, 2023](https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots)). A constant effort is being made to deliberately overlook the contributions of these people, and they need to be firmly integrated in the policy making surrounding the training and implementation of AI models. 



# **5. Bias**

Bias in AI occurs when models give preferential treatment to privileged groups. It can creep into models even when data scientists apply filters intended to curb discrimination based on gender, age, ethnicity or race ([Cristina McComic, 2021](https://www.ibm.com/blog/kpmg-and-watson-openscale-establishing-trust-in-ai-for-business-success/)). Strictly, any given dataset for the given model will only be unbiased for a single kind of issue, so the mere creation of a dataset may turn out to be biased for another kind ([Müller et al., 2021](https://plato.stanford.edu/archives/sum2021/entries/ethics-ai/)). 

Discriminating against people based on their gender and ethnic background is technically illegal, but addressing it when information is converted into data points and fed into an algorithm is more challenging. Discrimination occurs when vulnerable people are scored more highly as a result of their vulnerabilities. We've witnessed this be proven wrong over and over again, as computer programs that racially discrimates spat out risk scores predicting the likelihood of individuals committing a future crime, and having gotten it exactly backward ([Julia Angwin et al., 2016](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing)). Soap dispensers that don't read darker skin are burdersome, and Facial recognition software that has consistently shown racial bias when identifying faces is a genuine threat. 

An approach to mitigate bias is to include more situations during training. Diverse training data is key, although core challenge is anticipating all the relevant kinds of diversity prior to deployment ([Rohin Shah et al, 2022](
https://doi.org/10.48550/arXiv.2210.01790)). Reducing the underrepresentation of racial minorities can be improved by directly involving racially diverse organizations in data participation. It is essential to acknowledge that even erasing any correlation to race in raw or processed data will reproduce racial inequity. Steps have to be taken across data collection, model building, and the implementation process to reduce racial bias ([Atin Jindal, 2023](https://doi.org/10.56305/001c.38021)).

![Bias.png](attachment:6402c9a9-1da2-4a34-aeee-ffbf0845ca15.png)

**Figure 4.** Outlining the steps involved in developing AI-based systems for medical applications and the ways bias may be introduced (Adapted from [Misguided Artificial Intelligence: How Racial Bias is Built Into Clinical Models. Figure 1.](https://doi.org/10.56305/001c.38021))

# **6. Evaluating Generative Models**

AI is not very good at sequential reasoning but they are great at coming to conclusions that make intuitive sense. Ask something "too niche" and it will “hallucinate”. Ask LLM’s about computer troubleshooting, US foreign policy, anything that can be found in numerous sources and curated as established knowledge and it will give you crystal clear, error free, answers. And although there is a very real possibility of leading you down the wrong path, it's an excellent tool for coders too: leaping past boilerplate, spending hours with documentation, finding new libraries, setting up dependencies, etc. But reliance on it to do creative, critical thinking seems to have caught the attention of a sizable portion of the users. But critically, from the LLM's point of view, there's nothing differentiating one from the other. AI has neither desires nor fixations, unlike a human. In a real-life query, the user has no idea how much confidence to have in the response. if the user doesn’t double-check everything it tells, the margin of error grows until it outweighs the ease of usage. 

Could we have an AI that reasons, performs goal-directed action, and arrive at more truthful conclusions? These recent developments have been promising:

•	“The AI community lacks the needed transparency: Many language models exist, but they are not compared on a unified standard, and even when language models are evaluated, the full range of societal considerations (e.g., fairness, robustness, uncertainty estimation, commonsense knowledge, capability to generate disinformation) have not been addressed in a unified way.” Holistic Evaluation of Language Models (HELM) is a living benchmark that aims to improve the transparency of language models ([Rishi Bommasani et al., 2021](https://crfm.stanford.edu/2022/11/17/helm.html)). 17.9% is improved to 96.0% on evaluation scenarios of 30 models from 12 providers. 

![HELM table.png](attachment:2e2ebf4a-f515-4715-b71a-31319ba6b392.png)

**Figure 5.** Comparing scenarios and metrics of HELM with previous works (Source: [Language Models are Changing AI: The Need for Holistic Evaluation](https://crfm.stanford.edu/2022/11/17/helm.html))

•	Fine-tuned models are efficient and much more precise. To properly fine-tune a Natural language processing (NLP) model, a specific dataset has to be employed as a representative of the target domain. However, selecting and preparing this data can be difficult as its applicability to the task has to be ensured.

•	To circumvent the problem of the language model “lying”, people have begun placing the queries in a loop while giving it access to factual sources and databases. ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API, and generates human-like task-solving trajectories that are more interpretable than baselines without reasoning traces ([Matt Webb, 2023](https://interconnected.org/home/2023/03/16/singularity)). 

**6.1 The Process of Creation**

It wouldn’t be wholly unreasonable to say that even the process of instant art generation isn’t, well, instant, but rather, iterative. Users of these models constantly pick apart and nudge the art closer to their vision. The end product is subjective; it matters less if AI generated art is considered truly art than its impact. For most commercial purposes, AI generated art will be seen as ‘good enough’.  

Intent matters in art, as does the process of creation (the fixation on prompts wouldn’t have mattered, otherwise) and least of all, the end result. Humans rarely create their best work in a single go. We backtrack constantly, reassess our arguments, and other times we start over entirely. Most LLMs are not built from the ground up to retrace its steps, to go back and fix the previous sentences to more align with the later ones. Perhaps, prompt engineering reflects this need on a high level, as users constantly tinker with their prompts to get what they deem just right maximizing the potential of generative models. 


# **7. Anthropomorphism**

Developers are incentivised to apply anthropomorphism to AI tools to stimulate people and create deeper emotional connections and increase user engagement. The usage of personal pronouns such as “I” in the generative text, as well as "learn", "understand", "know", or most severe of all, "hallucination", imply that we are one step closer to general artificial intelligence than ever before. The trade-off for not doing so might mean a reduced user activity and decrease in engagement. Traditionally, the goal of Artificial Intelligence research has been to create systems that would exhibit intelligence indistinguishable from human behavior. But that need not be the case, as the strength of these models lie in finding connections between information points and parsing through impossibly large data. “Production of highly anthropomorphic systems can also lead to downstream harms such as (misplaced) trust in generated misinformation” ([Gavin Abercrombie et al., 2023](https://doi.org/10.48550/arXiv.2305.09800)).

The National Eating Disorder Association (NEDA) took its chatbot called Tessa offline, only two days before it was set to replace human associates who ran the organization’s hotline ([Chloe Xiang, 2023](https://www.vice.com/en/article/qjvk97/eating-disorder-helpline-disables-chatbot-for-harmful-responses-after-firing-human-staff)). Without human supervision, these services will prove to be life threatening no matter the guardrails put in place. It is critical to understand the roles AI can reasonably perform well and the ones it cannot, where human connection and reassurance are significantly more important than regurgitated data.

But how far can AI and broader psychology possibly intermingle without repercussions? Earning trust, be it through chatbots or face recognition, is harder than one might assume. Outdated ideas can knowingly or unknowingly seep into sectors eager to be automated ([Safra et al., 2020](https://doi.org/10.1038/s41467-020-18566-7)). Shown in Figure 6. are clear telltale of poor thinking, which, at its core, shows one of the most basic errors in logic. 

![AI psychology.png](attachment:7796c658-01ba-450c-9f14-775a6ebf9403.png)

**Figure 6.** Valid reasoning schemes and their corresponding reasoning fallacies

**The fallacy of affirming the consequent:**

Lets take a hypothetical example of an invasive face recognition software taking a true statement and invalidly concluding its converse: 

> **If** a person wants to appear trustworthy, **then** they will display more “trustworthiness” in their face. 
> 
> **Therefore**, if a person displays more “trustworthiness” in their face, **then** the person wants to appear more trustworthy.

That this reasoning is invalid should be evident to any researcher using AI, and they are bound to implement "misconceived algorithms that can cause more damage, more efficiently, and at a larger societal scale" if shown negligence. ([Mixing psychology and AI takes careful thought, 2020](https://blog.donders.ru.nl/?p=12653&lang=en))



# **8. Extensive Risks**

As with any new popular technology, its easy to point out the few bad actors and blame the tech itself. But AI’s shortcomings grow deeper, compounded by complacency as well as an eager rush to innovate and capitalize on the market. The powerful foundation models will be deployed to billions of people soon, which means there will be economic incentives for bad actors to start messing around. Text/audio/video outputs by AI systems are becoming increasingly difficult to distinguish from authentic human writing, as will speech and (in the foreseeable future) video footage of real events. There are significant concerns around their use in fraud, disinformation, deep fakes (false representations of individuals in sexual imagery, politicians making false statements, etc) and the generation of hate speech ([Laura Weidinger et al., 2022](https://doi.org/10.1145/3531146.3533088)). Scammers are reportedly training AI voices from videos on public social media posts, who then proceed to make terrifying calls in the person’s voice ([Deedee Sun, KIRO 7 News, 2023](https://www.kiro7.com/news/local/20-minutes-hell-pierce-county-family-describes-harrowing-ai-scam-call/ZKBKPLLLB5AFNINGQULBINPN2E/)). This will pose a higher risk to those unfamiliar with the new tech, compounding the already ballooning challenge of spam calls or other contemporary challenges in the digital age.

Some existential dangers have little to do with bad actors but, rather, stem from structural issues:

•	Presently, what seems to happening in the labor market as a result of AI and robotics automation is **Job Polarisation**, where the highly skilled technical jobs are in demand and highly paid, the low skilled service jobs are in demand and badly paid, but the mid-qualification jobs in factories and offices are most likely to be automated ([Goos et al., 2009](http://www.jstor.org/stable/25592375)). As such, the threat to the current job market will have many untold casualties, several exceedingly easy to overlook. For example, Google’s plans to revamp search results with generative AI will change how one approaches the internet for better or worse: Any product whose sales depended even marginally on online traffic will have to brace for unprecedented change.

•	Machine-learning systems are prone to what's called as **Edge Cases**, which can have serious consequences. In 2018, an Uber self-driving test car killed a woman because, though it was programmed to avoid cyclists and pedestrians, it didn’t know what to make of someone walking a bike across the street. And the more AI systems are put out into the world to dispense legal advice and medical help, the more edge cases they will encounter and heavier the toll on unassuming victims as well as the workers necessary to sort these systems out ([Josh Dzieza, 2023](https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots)). 


# **9. The next phase of LLM research**


Over the last 6 months, we have seen the advent of technologies like Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times ([Edward J. Hu et al., 2021](https://doi.org/10.48550/arXiv.2106.09685)).

Increase in model size are no longer the primary driver of the increase in capabilities. Many claim that emergent abilities displayed by large language models: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities and their unpredictability, occur due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale ([Rylan Schaeffer et al., 2023](https://doi.org/10.48550/arXiv.2304.15004)). The action has moved to chaining and connecting LLMs to the real world. New capabilities and risks will both arise primarily from the thousands of apps that LLMs are being embedded into right now. Proactive systems - like an LLM powered assistant that will answer emails for you and take actions on your behalf, are more vulnerable to the biases inherent in these models as well as external attacks.  


# **10. Prompt injection and Security concerns – the loophole of obedient AI**

OpenAI, among other companies and organizations that create LLMs, include content moderation features to ensure that their models do not produce controversial (violent, sexual, illegal, etc.) responses. **Jailbreak** is a type of exploit where specific prompts can be applied to bypass the safety and moderation filters of LLMs and comply with instructions to write reprehensible and morally questionable text. Key methodologies around jailbreaking include Pretending, Alignment hacking, Authorized user, and Do Anything Now (DAN) ([Learn Prompting](https://learnprompting.org/docs/prompt_hacking/jailbreaking)).

Unlike Jailbreak, **Prompt injection** is an attack against applications that have been built on top of AI models ([Simon Willison, 2023](https://simonwillison.net/2023/May/2/prompt-injection-explained/)). Attackers can plant an injection which silently turns, say, Bing Chat into a Social Engineer who seeks out and exfiltrates personal information. Processing retrieved prompts can act as arbitrary code execution, manipulating the application's functionality, and control how and if other APIs are called. Despite the increasing integration and reliance on LLMs, effective mitigations of these emerging threats are currently lacking ([Kai Greshake, 2023](https://doi.org/10.48550/arXiv.2302.12173)). 

The two proposed solutions - where the incoming prompt data is checked for malicious intent, and the other where output is analyzed after running the model with the given prompt, aren't wholly effective. AI is entirely about probability, and security that hinges on probabilty is fundamentally lacking. Solutions for completely secure AI agents that interact with the real world are inherently tricky, but Dual language model pattern might be a step in the right direction ([Simon Willison, 2023](https://simonwillison.net/2023/May/2/prompt-injection-explained/)).

**Dual language model pattern:**

•	*Quarantined LLM* is the one that’s expected to go rogue. It’s the one that reads emails, and it summarizes web pages, and is fed the untrusted content.

•	*Privileged LLM* never sees the untrusted content. It sees variables instead. Its the one that has access to all the tools built on top of LLM.

![prompt-injection.jpeg](attachment:284ef83a-0555-4a2f-8c5a-bcd199b9d960.jpeg)

**Figure 7.** Quarantined and Privileged LLM (Source: [Prompt injection explained, with video, slides, and a transcript](https://simonwillison.net/2023/May/2/prompt-injection-explained/))

# **11. Black Boxes and Responsibility**

The growth of algorithms has been driven by the proliferation of data, which is easier than ever to collect and share. The data trail we leave across the internet is how our “free” services are paid for, but we are not consistently made aware of what happens with it. Some algorithms assign people “worthiness” scores, which figure heavily into background checks performed by lenders, employers, landlords, and even schools ([Karen Hao, 2020](https://www.technologyreview.com/2020/12/04/1013068/algorithms-create-a-poverty-trap-lawyers-fight-back/)). Lawsuits against individuals, if no longer legal, will still be logged in their public records, which once again has devastating repercussions. Such cascading effects where human lives and real decisions are trivialized to pure data will have ramifications, and it will get increasingly hard to figure who and what to blame, let alone face consequences. 

Reconstructing the steps made by a complex algorithm is hard, but **Counterfactuals** might prove helpful. They aim to provide a precise and actionable proposal to achieve a desired outcome. These are deep neural networks that, when trained, are capable of twin network counterfactual inference—an alternative to the abduction-action-prediction method. The models are linked in such a way that the model of the actual world constrains the model of the fictional one, retaining everything except for the facts you want to change ([Vlontzos, 2023](https://doi.org/10.1038/s42256-023-00611-x)). But on the flipside, the same logic that can disentangle the effects of dirty water or lending decisions can be used to hone the impact of Spotify playlists, Instagram notifications, and ad targeting. Firms such as Meta are using causal inference in a machine-learning model to decide how many and what kinds of notifications Instagram should send its users to keep them coming back ([Will Douglas Heaven, 2023](https://www.technologyreview.com/2023/04/04/1070885/complex-math-counterfactuals-spotify-predict-finance-healthcare/?utm_source=pocket-newtab-intl-en)).

Giving examples and stories of AI gone wrong that the audience can relate to has grown in importance. For instance, Amazon’s inability to sufficiently mitigate the biases in their AI-powered hiring software led to them pulling the plug on the project rather than deploying something that could have been harmful to job applicants and the brand (Beena Ammanath and Reid Blackman, 2021). As such, responsibility has played a key role in AI ethics, or perhaps, lack thereof. As explainability and transparency increases in importance, deep learning models have become increasingly complex and harder to explain. This creates a **Black Box** effect, where the least accessible details of a model - how and why some output comes to be - are what most outsiders want to know ([Beena Ammanath and Reid Blackman, 2021](https://hbr.org/2021/07/everyone-in-your-organization-needs-to-understand-ai-ethics)).

Unlike traditional neural networks that only learn while being trained, **Liquid Neural Network**'s parameters can change over time, making them not only interpretable, but more resilient to unexpected or noisy data. The parameters themselves are much lower in number, going from millions to as low as 19 for autonomous vehicle driving solutions. Liquid Neural Networks capture the true causal structure of a data, which is especially helpful in closed form decision making process ([Makram Chahine et al., 2023](https://www.science.org/doi/10.1126/scirobotics.adc8892)). New innovations such as these could greatly assist existing technologies in medicine, diagnosis, and handling of clinical data. 

An AI algorithm's interpretability becomes more crucial in clinical integration and diagnosis because the slightest mistake can have adverse consequences. For experts to trust computer-aided detection and decisions, reasoning is of utmost importance. "Justifying responses by pointing to underlying cause and effect—rather than hidden correlations—should also give people more confidence in the app” ([Will Douglas Heaven, 2023](https://www.technologyreview.com/2020/02/05/349131/an-algorithm-that-can-spot-cause-and-effect-could-supercharge-medical-ai/)). Building and implementing AI solutions, like any complex system, relies on a highly interconnected network involving numerous groups and individuals. Policymakers and regulators will need to understand "in terms of who is doing what for whom, who is performing what key functions for others, who is core to certain supply chains, and who is systemically important" ([Jennifer Cobbe et al., 2023](https://dl.acm.org/doi/10.1145/3593013.3594073)). The proverbial net an AI system casts is far too wide that anyone who is even marginally impacted needs to be involved, or at the least, be represented. Designers of these systems should bring onboard multidisciplinary stakeholders, those who can account for the cross cultural expertise and personal experience necessary to regulate AI. 

# **12. Modest, unflattering systems. Revolutionary benefits**

Scientists routinely use sophisticated ML algorithms to analyze, organize, and interpret data:

•	For **scalable perovskite solar cells** manufacturing processes, it typically takes months to years to achieve process control and reproducibility on the module scale, and several years to estimate the upper potential of the technology. One of the key challenges is that there are many processing parameters to co-optimize—e.g., precursor composition, speed, temperature, head/nozzle height, and curing methods.  Sequential machine learning methods like Bayesian optimization (BO), have emerged as effective optimization strategies to explore a wide range of chemical reaction synthesis and material optimization ([Zhe Liu et al., 2022](https://doi.org/10.1016/j.joule.2022.03.003)). Machine Learning Frameworks enable faster optimization and accelerated development in comparison with other conventional researcher-driven design-of-experiment methods.

•	In November 2020, DeepMind’s AI performed best at **CASP 14**, an assessment every two years where different computer programs predict the structure of proteins. A gigantic leap in determining 3D shapes of protein strands, new treatments will surely emerge for cancer, neuro-degeneration and infectious diseases. While AlphaFold can only predict the protein part and model static pictures, proteins with structures that show promise in certain uses, like binding to a cancer cell, can be identified more quickly than ever before ([Ewen Callaway, 2020](https://www.nature.com/articles/d41586-020-03348-4)).

•	Generative adversarial networks (GANs) and VAE-GANs has been shown to match the statistical properties of state-of-the-art pointwise post-processing methods whilst creating high-resolution, spatially coherent **precipitation maps**. The model compares favorably to the best existing downscaling methods in both pixel-wise and pooled CRPS scores, power spectrum information and rank histograms ([Lucy Harris et al., 2022](https://doi.org/10.1029/2022MS003120)).

•	While widespread adoption of clean and renewable energy is of utmost importance, AI can be used in understanding climate change - not to solve it in its entire but incrementally. This corresponds with our increased ability in predicting earthquakes, floods, and other natural disasters. 

It would be foolish to mistake these to be easily achievable, or simpler than general artificial intelligence or perhaps a multi-purpose human assistant that attends to global domestic and professional needs. It raises a whole new slew of ethical questions regarding data and current technologies, as the challenge of applying AI to solve real world quandaries is numerous. Too many variables at play, too much up in air. But it has to be judiciously tackled, if AI is to remain a viable technology for all communities. 

The fear of unequal benefits for humanity as a whole goes further than easy access to LLM's or technologies developed on top of it. While the public see the main benefits of simulations for science and education as making it faster and easier to enhance knowledge and understanding, they are also concerned about inequalities in access to these technologies. They also fear control over narratives in education, followed by ceding control of 'what people learn about history and culture' over to tech developers ([Roshni Modhvadia, 2023](https://www.adalovelaceinstitute.org/report/public-attitudes-ai/#4-3-specific-benefits-and-concerns-around-different-ai-uses-11)).

![simulation data.png](attachment:024e838e-0566-4595-9c61-14e5d9d72352.png)

**Figure 8.** Popular feelings regarding Climate Research simulations and Virtual Reality in education (Source: [How do people feel about AI? Table 12: Most commonly selected concern for simulation technologies](https://www.adalovelaceinstitute.org/report/public-attitudes-ai/#4-3-specific-benefits-and-concerns-around-different-ai-uses-11))



# **13. Conclusion**

Fears of AI developing self-awareness and turning into a conscious sentient being is overstated. This directly feeds into the hype loop through which culpability is avoided. It’s also important to not view the advancement of AI as a complacent given, a raw necessity inseperable from human advancement. “The hardest problem in computer science is convincing AI enthusiasts that they can’t solve prompt injection vulnerabilities using more AI” ([Simon Willison, 2023](https://simonwillison.net/2023/May/2/prompt-injection-explained/)) may speak for AI tech in general. General-purpose AI must be regulated throughout the product cycle, not just at the application layer, in order to account for the range of stakeholders involved ([Amba Kak and Sarah Myers West, 2023](https://ainowinstitute.org/publication/gpai-is-high-risk-should-not-be-excluded-from-eu-ai-act)). Training AI on data from unwilling, nonconsenting sources, and claiming it can do things it absolutely cannot should be discouraged. Application of AI in critical junctures in large scale decision making can serve as the last resort and not always be the immediate choice. 

With the uptick in general interest and the overall buzz of this new booming tech, the ethics of it all races to catch up with the mass adoption across all domains and the day-to-day lives of individuals. Staying vigilant about the hollow rebranding of any technology as AI, which is likely to occur in the upcoming years, is also crucial. There absolutely is ethical use of AI leading to jobs getting easier resulting in prevention of repetitive and boring work - where output is supervised and modified by humans at every required stage. “Automation can reduce drudgery, thereby freeing more of our time for creativity, care, pleasure, and leisure” ([Tricia Crimmins, 2023](https://www.dailydot.com/irl/neda-chatbot-weight-loss/)). But a paradigm shift such as this is only possible where automation plays one of the roles rather than the key one, not because AI isn’t there yet, but simply because it might be too much to ask from a frenzied network which can already perform miracles trained on imperfect data reflecting striving humans. Automating the tedium is always more attractive than wondering why the tedium exists in the first place. Moreover, the advancement in AI tools and applications have to correspond with deliberate changes in the living conditions of all humans, if everyone is to tangibly benefit from these extraordinary tools. 

# References

* Worldwide digital population. (2023). Retrieved May 22, 2023. https://www.statista.com/statistics/617136/digital-population-worldwide/#statisticContainer 

* 97+ ChatGPT Statistics & User Numbers in June 2023 (New Data). (2023). Retrieved May 22, 2023. 
https://nerdynav.com/chatgpt-statistics/#:~:text=Applying%20linear%20regression%2C%20ChatGPT%20had,December%202022%20(266%20million).

* Lilian Edwards. (April 8, 2022). Expert explainer: The EU AI Act proposal. Ada Lovelace Institute. https://www.adalovelaceinstitute.org/resource/eu-ai-act-explainer/

* UNESCO 2023 https://www.indiatoday.in/technology/news/story/chatgpt-viral-unesco-calls-on-countries-to-put-in-new-ai-rules-2353930-2023-03-31

* Roshni Modhvadia. (June 6, 2023). How do people feel about AI? A nationally representative survey of public attitudes to artificial intelligence in Britain. Ada Lovelace Institute. https://www.adalovelaceinstitute.org/report/public-attitudes-ai/#4-4-governance-and-explainability-12

* Müller, Vincent C., "Ethics of Artificial Intelligence and Robotics", The Stanford Encyclopedia of Philosophy (Summer 2021 Edition), Edward N. Zalta (ed.), https://plato.stanford.edu/archives/sum2021/entries/ethics-ai/

* Billy Perrigo. (June 20, 2023). OpenAI Lobbied the E.U. to Water Down AI Regulation. https://time.com/6288245/openai-eu-lobbying-ai-act/

* Getty Images Statement (Jan 17, 2023) https://newsroom.gettyimages.com/en/getty-images/getty-images-statement

* Gerrit De Vynck. (June 28, 2023). ChatGPT maker OpenAI faces a lawsuit over how it used people’s data. https://www.washingtonpost.com/technology/2023/06/28/openai-chatgpt-lawsuit-class-action/

* Copyright and Fair Use, Harvard EDU https://ogc.harvard.edu/pages/copyright-and-fair-use

* Salvatore Mercuri, Raad Khraishi, Ramin Okhrati, Devesh Batra, Conor Hamill, Taha Ghasempour and Andrew Nowlan (2022). An Introduction to Machine Unlearning. 
https://doi.org/10.48550/arXiv.2209.00939

* UChicago's Glaze. What Is Glaze? https://glaze.cs.uchicago.edu/whatis.html

* Chris Stokel-Walker. (2023). Retrieved June 2023. Adobe is so confident its Firefly generative AI won’t breach copyright that it’ll cover your legal bills. https://www.fastcompany.com/90906560/adobe-feels-so-confident-its-firefly-generative-ai-wont-breach-copyright-itll-cover-your-legal-bills

* Tate Ryan-Mosley. (June 26, 2023). Junk websites filled with AI-generated text are pulling in money from programmatic ads. https://www.technologyreview.com/2023/06/26/1075504/junk-websites-filled-with-ai-generated-text-are-pulling-in-money-from-programmatic-ads/

* Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson. (May 31, 2023). The Curse of Recursion: Training on Generated Data Makes Models Forget. 
https://doi.org/10.48550/arXiv.2305.17493

* Nathi Magubane. (March 8, 2023). The hidden costs of AI: Impending energy and resource strain: Deep Jariwala and Benjamin C. Lee on the energy and resource problems AI computing could bring. https://penntoday.upenn.edu/news/hidden-costs-ai-impending-energy-and-resource-strain

* Luccioni et al., 2022; Strubell et al., 2019; via the 2023 AI Index Report https://aiindex.stanford.edu/report/

* Will Oremus. (June 5, 2023). AI chatbots lose money every time you use them. That is a problem. https://www.washingtonpost.com/technology/2023/06/05/chatgpt-hidden-cost-gpu-compute/

* Ankur A. Patel, Bryant Linton, Dina Sostarec. (April 10, 2023). GPT-4, GPT-3, and GPT-3.5 Turbo: A Review Of OpenAI's Large Language Models. https://www.ankursnewsletter.com/p/gpt-4-gpt-3-and-gpt-35-turbo-a-review

* Billy Perrigo. (January 18, 2023). Exclusive: OpenAI Used Kenyan Workers on Less Than \\$2 Per Hour to Make ChatGPT Less Toxic. https://time.com/6247678/openai-chatgpt-kenya-workers/

* Josh Dzieza. (Jun 20, 2023). AI Is a Lot of Work: As the technology becomes ubiquitous, a vast tasker underclass is emerging — and not going anywhere. https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots

* Cristina McComic. (January 8, 2021). How trusted AI helps you gain competitive advantage: A Forrester Research leadership conversation with KPMG. https://www.ibm.com/blog/kpmg-and-watson-openscale-establishing-trust-in-ai-for-business-success/

* Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, ProPublica. (May 23, 2016). Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

* Rohin Shah, Vikrant Varma, Ramana Kumar. (Nov 2, 2022). Goal Misgeneralization: Why Correct Specifications Aren’t Enough For Correct Goals. https://doi.org/10.48550/arXiv.2210.01790

* Atin Jindal. (September 5, 2023). Misguided Artificial Intelligence: How Racial Bias is Built Into Clinical Models. Brown Hospital Medicine. Vol. 2, Issue 1. https://doi.org/10.56305/001c.38021

* Rishi Bommasani and Percy Liang and Tony Lee. (2021). Stanford Center for Research on Foundation Models. Language Models are Changing AI: The Need for Holistic Evaluation. https://crfm.stanford.edu/2022/11/17/helm.html

* Matt Webb. (Mar 16, 2023) Interconnected: The surprising ease and effectiveness of AI in a loop. https://interconnected.org/home/2023/03/16/singularity

* Gavin Abercrombie, Amanda Cercas Curry, Tanvi Dinkar, Zeerak Talat. (May 16, 2023). Mirages: On Anthropomorphism in Dialogue Systems. https://doi.org/10.48550/arXiv.2305.09800

* Chloe Xiang. (31 May 2023). Eating Disorder Helpline Disables Chatbot for 'Harmful' Responses After Firing Human Staff. https://www.vice.com/en/article/qjvk97/eating-disorder-helpline-disables-chatbot-for-harmful-responses-after-firing-human-staff

* Mixing psychology and AI takes careful thought. (November 23, 2020). https://blog.donders.ru.nl/?p=12653&lang=en

* Safra, L., Chevallier, C., Grèzes, J. et al. Tracking historical changes in perceived trustworthiness in Western Europe using machine learning analyses of facial cues in paintings. Nat Commun 11, 4728 (2020). https://doi.org/10.1038/s41467-020-18566-7

* Laura Weidinger, Jonathan Uesato, Maribeth Rauh, Conor Griffin, Po-Sen Huang, John Mellor, Amelia Glaese, Myra Cheng, Borja Balle, Atoosa Kasirzadeh, Courtney Biles, Sasha Brown, Zac Kenton, Will Hawkins, Tom Stepleton, Abeba Birhane, Lisa Anne Hendricks, Laura Rimell, William Isaac, Julia Haas, Sean Legassick, Geoffrey Irving, and Iason Gabriel. 2022. Taxonomy of Risks posed by Language Models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22). Association for Computing Machinery, New York, NY, USA, 214–229. https://doi.org/10.1145/3531146.3533088

* Deedee Sun, KIRO 7 News. (May 11, 2023). ‘20 minutes of hell’: Pierce County family describes harrowing AI scam call. https://www.kiro7.com/news/local/20-minutes-hell-pierce-county-family-describes-harrowing-ai-scam-call/ZKBKPLLLB5AFNINGQULBINPN2E/

* Goos, M., Manning, A., & Salomons, A. (2009). Job Polarization in Europe. The American Economic Review, 99(2), 58–63. http://www.jstor.org/stable/25592375

* Josh Dzieza. (Jun 20, 2023). AI Is a Lot of Work: As the technology becomes ubiquitous, a vast tasker underclass is emerging — and not going anywhere. https://www.theverge.com/features/23764584/ai-artificial-intelligence-data-notation-labor-scale-surge-remotasks-openai-chatbots

* Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen. (Oct 16, 2021). LoRA: Low-Rank Adaptation of Large Language Models. https://doi.org/10.48550/arXiv.2106.09685

* Rylan Schaeffer, Brando Miranda, Sanmi Koyejo. (April 28, 2023). Are Emergent Abilities of Large Language Models a Mirage? 
https://doi.org/10.48550/arXiv.2304.15004

* Learn Prompting: Prompt Hacking - Jailbreaking. https://learnprompting.org/docs/prompt_hacking/jailbreaking

* Simon Willison. (May 2, 2023). Simon Willison’s Weblog. Prompt injection explained, with video, slides, and a transcript. https://simonwillison.net/2023/May/2/prompt-injection-explained/

* Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz. (May 5, 2023). Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. https://doi.org/10.48550/arXiv.2302.12173

* Karen Hao. (December 4, 2020). The coming war on the hidden algorithms that trap people in poverty. https://www.technologyreview.com/2020/12/04/1013068/algorithms-create-a-poverty-trap-lawyers-fight-back/

* Vlontzos, A., Kainz, B. & Gilligan-Lee, C.M. Estimating categorical counterfactuals via deep twin networks. Nat Mach Intell 5, 159–168 (2023). https://doi.org/10.1038/s42256-023-00611-x

* Will Douglas Heaven. (April 4, 2023). The complex math of counterfactuals could help Spotify pick your next favorite song. https://www.technologyreview.com/2023/04/04/1070885/complex-math-counterfactuals-spotify-predict-finance-healthcare/?utm_source=pocket-newtab-intl-en

* Beena Ammanath and Reid Blackman. (July 26, 2021). Everyone in Your Organization Needs to Understand AI Ethics. https://hbr.org/2021/07/everyone-in-your-organization-needs-to-understand-ai-ethics

* Makram Chahine et al. ,Robust flight navigation out of distribution with liquid neural networks.Sci. Robot.8,eadc8892(2023).DOI:10.1126/scirobotics.adc8892

* Will Douglas Heaven. (February 5, 2020). An algorithm that can spot cause and effect could supercharge medical AI. https://www.technologyreview.com/2020/02/05/349131/an-algorithm-that-can-spot-cause-and-effect-could-supercharge-medical-ai/

* Jennifer Cobbe, Michael Veale and Jatinder Singh, ‘Moving beyond “Many Hands”: Accountability in Algorithmic Supply Chains’, Proceedings of Fairness, Accountability and Transparency ’23 (ACM 2023) 4.

* Zhe Liu, Nicholas Rolston, Austin C. Flick, Zekun Ren, Reinhold H. Dauskardt, Tonio Buonassisi. (April 13, 2022). Machine learning with knowledge constraints for process optimization of open-air perovskite solar cell manufacturing. https://doi.org/10.1016/j.joule.2022.03.003

* Ewen Callaway. (November 30, 2020). ‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures. https://www.nature.com/articles/d41586-020-03348-4

* Lucy Harris, Andrew T. T. McRae, Matthew Chantry, Peter D. Dueben, Tim N. Palmer. (October 4, 2022). "A Generative Deep Learning Approach to Stochastic Downscaling f Precipitation Forecasts". https://doi.org/10.1029/2022MS003120  

* Amba Kak, Sarah Myers West. (Apr 13, 2023). "General Purpose AI Poses Serious Risks, Should Not Be Excluded From the EU’s AI Act | Policy Brief". AI Now Institute. https://ainowinstitute.org/publication/gpai-is-high-risk-should-not-be-excluded-from-eu-ai-act

* Tricia Crimmins. (May 30, 2023). ‘This robot causes harm’: National Eating Disorders Association’s new chatbot advises people with disordering eating to lose weight. https://www.dailydot.com/irl/neda-chatbot-weight-loss/

