<a href="https://colab.research.google.com/github/mikeogunmakin/research/blob/main/AI/AI_Engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Engineering

Link: https://www.amazon.co.uk/AI-Engineering-Building-Applications-Foundation/dp/1098166302/

## Introduction to Building AI Applications with Foundation Models

AI after 2020 is defined primarily by scale. The models behind applications such as ChatGPT, Google’s Gemini, and Midjourney operate at unprecedented levels of size and computational intensity, consuming significant amounts of global electricity and pushing up against the limits of publicly available internet data for training. This scaling has two major consequences. The first is that **AI models have become dramatically more capable**, which has enabled a rapid expansion of applications that improve productivity, generate economic value, and enhance daily life. More people and teams can now use AI to accomplish increasingly complex tasks. The second consequence is **that training these large models demands enormous datasets, specialised expertise, and massive compute budgets, resources that only a small group of organisations can access**. As a result, a “model as a service” paradigm has emerged in which these organisations expose their models through APIs so that others can build AI-powered applications without training models themselves.

This dynamic creates a landscape where **demand for AI applications continues to grow while the barriers to building them continue to fall**. This shift has turned AI engineering—the discipline of creating applications on top of ready-made machine learning models—into one of the fastest-growing areas in technology. Although AI applications existed long before large language models became dominant, powering systems such as fraud detection, churn prediction, and product recommendations, the new generation of large-scale models introduces both expanded opportunities and new challenges. Many traditional principles of deploying machine learning systems remain important, but LLMs bring issues such as hallucination, prompt dependency, evaluation complexity, and new forms of system design. The combination of accessible models and skyrocketing demand is reshaping how AI systems are built and deployed.


### The Rise of AI Engineering

#### From Language Models to Large Language Models

The rise of AI engineering is the result of decades of progress beginning with early language models in the 1950s and culminating in today’s foundation models. Although applications such as ChatGPT and GitHub Copilot appear to have emerged suddenly, they are built upon a long lineage of breakthroughs in statistical modelling, information theory, deep learning, and training methods. Modern large language models were made possible primarily through self-supervision, which allowed models to learn from massive amounts of raw text without relying on costly human-labelled datasets. Traditional language models captured the statistical structure of language by predicting how likely words or tokens were to appear in a given context. This statistical foundation can be traced back to early work by scholars such as Claude Shannon, whose ideas on entropy and prediction still influence language modelling today.
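To make "predicting how likely words are to appear in a given context" concrete, here is a minimal sketch of a bigram model, one of the simplest statistical language models. The corpus and the function names (`train_bigram_model`, `next_word_probabilities`) are made up for illustration, and real models condition on far more context than a single preceding word.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probabilities(counts, prev):
    """Turn the raw counts for one context word into probabilities."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}  # context word never seen in the corpus
    return {word: n / total for word, n in counts[prev].items()}

# Tiny illustrative corpus (an assumption, not data from the book).
corpus = [
    "my favourite colour is blue",
    "my favourite colour is green",
    "my favourite colour is blue",
    "the sky is blue",
]
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "is")
print(probs)  # {'blue': 0.75, 'green': 0.25}
```

In this toy corpus "is" is followed by "blue" three times and "green" once, so the model assigns them probabilities of 0.75 and 0.25.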

The basic unit of a language model is the token, which can be a character, a word, or part of a word, depending on the model. Working with tokens rather than whole words lets a model process language efficiently, handle rare or invented words by splitting them into known pieces, and represent meaning-rich units such as prefixes or suffixes. There are two main types of language model: masked models, such as BERT, which learn to fill in missing tokens using the context on both sides, and autoregressive models, such as GPT, which predict the next token in a sequence and can therefore generate free-form text. Autoregressive models became the engine of generative AI because framing tasks as text completion proved remarkably general. Translation, summarisation, classification, reasoning, and even coding can all be expressed as variations of “complete this text,” transforming a simple mechanism into a powerful universal interface for problem-solving.
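The difference between the two model types comes down to which context the model is allowed to see. The sketch below builds one training example of each kind from the same token list. It is a simplification under stated assumptions: tokens are produced by a plain whitespace split rather than a real subword tokenizer, only one position is hidden, and the helper names and the `[MASK]` placeholder string are illustrative.

```python
MASK = "[MASK]"  # placeholder string, chosen for illustration

def masked_example(tokens, position):
    """Hide one token; the model sees context on both sides of the gap."""
    model_input = tokens[:position] + [MASK] + tokens[position + 1:]
    return model_input, tokens[position]

def autoregressive_example(tokens, position):
    """Show only the tokens before `position`; the model predicts that token."""
    return tokens[:position], tokens[position]

tokens = "why does the sun rise ?".split()

masked_input, masked_target = masked_example(tokens, 3)
prefix, next_token = autoregressive_example(tokens, 3)

print(masked_input, "->", masked_target)
# ['why', 'does', 'the', '[MASK]', 'rise', '?'] -> sun
print(prefix, "->", next_token)
# ['why', 'does', 'the'] -> sun
```

Both examples have the same target, but only the autoregressive one can be repeated token after token to generate new text, because it never relies on tokens that come later.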

The leap from ordinary language models to large language models was driven by self-supervision. In supervised learning, models depend on labelled datasets, which are expensive and time-consuming to produce at scale. Self-supervision removes this bottleneck by allowing models to generate their own training signals directly from raw text. Every sentence becomes multiple training examples, with the model learning to predict each successive token from the preceding context. Because unlabelled text is abundant—books, articles, forums, and online conversations—self-supervision enables training datasets of unprecedented size, making it possible for models to grow ever larger and more capable.
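The claim that every sentence becomes multiple training examples can be sketched in a few lines. The code below assumes whitespace tokenization and uses `<BOS>` and `<EOS>` as hypothetical start and end markers; the function name `next_token_examples` is also an illustrative choice. No human labelling is involved: each label is simply the next token in the text.

```python
BOS, EOS = "<BOS>", "<EOS>"  # illustrative start/end-of-sequence markers

def next_token_examples(sentence):
    """Return (context, next_token) pairs derived from one raw sentence."""
    tokens = [BOS] + sentence.split() + [EOS]
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

examples = next_token_examples("I love street food")
for context, target in examples:
    print(context, "->", target)
```

A four-word sentence yields five examples here: one for each word plus one for predicting the end marker.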

What counts as “large” is relative. Size is usually measured by the number of parameters a model contains. GPT-1, with 117 million parameters, was considered large when it was released in 2018, but this quickly changed as scaling laws demonstrated that increasing model size consistently improved performance. Models with tens or hundreds of billions of parameters are now common, and the threshold for “large” continues to rise with each generation. Larger models require more data because they have greater capacity to learn; training them on small datasets would underutilise that capacity and waste computational resources.

Together, these developments—token-based modelling, generative completion as a universal task interface, and self-supervision enabling massive datasets—laid the groundwork for foundation models and the emergence of AI engineering. Modern AI engineering builds on these scalable, general-purpose models to create practical systems, applications, and workflows, marking a shift from training bespoke models to designing products and tools powered by existing, highly capable models.



> For GPT-4, an average token is approximately ¾ the length of a word. So, 100 tokens are approximately 75 words.

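The rule of thumb above can be turned into a quick estimator, for example when budgeting context length or API cost. This is a rough sketch only: the 0.75 ratio is the approximate GPT-4 figure quoted in the note, the function names are made up, and the real ratio varies with the tokenizer and the language of the text.

```python
TOKENS_TO_WORDS = 0.75  # approximate ratio from the note above (GPT-4)

def estimate_words(num_tokens):
    """Rough word count for a given number of tokens."""
    return round(num_tokens * TOKENS_TO_WORDS)

def estimate_tokens(num_words):
    """Rough token count for a given number of words."""
    return round(num_words / TOKENS_TO_WORDS)

print(estimate_words(100))   # 75
print(estimate_tokens(1500)) # 2000
```

For an exact count, run the text through the tokenizer of the specific model rather than relying on this ratio.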

