## AI and multimedia learning

Multimedia Learning Theory and Cognitive Load Theory are two foundational frameworks in the learning sciences that explain how people process information and how instruction can be designed to support learning. Both theories are often referenced when designing videos, online courses, and other multimedia learning materials, especially in digital learning environments.

Multimedia Learning Theory, most notably developed by Richard Mayer, argues that people learn better from words and pictures together than from words alone. The theory is based on three key assumptions: learners have separate visual and verbal processing channels, each channel has limited capacity, and meaningful learning requires active cognitive processing. According to this theory, effective multimedia instruction helps learners select relevant information, organize it into coherent mental structures, and integrate it with what they already know.

Cognitive Load Theory focuses more directly on the limitations of working memory. It distinguishes between intrinsic cognitive load, which comes from the complexity of the material itself, extraneous cognitive load, which comes from poor instructional design, and germane cognitive load, which supports learning and schema construction. From this perspective, good instructional design reduces unnecessary mental effort and helps learners focus their limited cognitive resources on understanding the content.

Together, these theories have shaped how educators think about instructional videos, visual aids, and digital learning tools. However, they were largely developed before the rise of modern AI systems that can dynamically interact with learners.

## AI and Multimedia Learning
With the rapid advancement of AI, especially large language models and multimodal systems, I started to wonder whether the way people learn from multimedia has fundamentally changed. Learners are no longer just watching videos or reading text. They can now ask questions, request summaries, generate visual representations, and interact with content in real time through AI systems.

With this question in mind, I conducted a small literature search and found two articles that stood out in helping me think through how AI intersects with multimedia learning theories.


This raises a deeper question: why are we often less tolerant of AI mistakes than human ones? If AI is a human-made system trained on historical data, its biases are ultimately reflections of human choices. Yet we frequently hold AI to a higher standard of consistency and accuracy. What, then, is a reasonable tolerance for AI error or hallucination, especially in high-stakes evaluative contexts?

These questions are not abstract for me. In projects like KidTalkMirror and other AI persona–based feedback tools, we actively use AI to evaluate human behavior and generate suggestions. Designing these systems forces us to confront the same dilemmas discussed in the Harvard article: whose definition of fairness is embedded in the model, which voices are excluded, and how easily those assumptions can drift over time. Rather than asking whether AI or humans are fairer, the more important task may be to continually interrogate what kind of fairness we are designing for, and who gets to define it.

![AlShaikh et al., 2024](ALshaikh2024.png)

The first article examines the design of an AI Educational Video Assistant that explicitly applies the Cognitive Theory of Multimedia Learning (CTML). The system integrates automatic speech recognition and a large language model to support learning from educational videos. Instead of treating AI as a purely technical tool, the authors intentionally align the system’s features with CTML principles such as signaling, segmenting, personalization, and guided discovery.

For example, the tool transcribes videos and highlights key concepts to help learners focus on essential information. It also allows learners to ask questions about the video and receive targeted answers, which can reduce extraneous cognitive load by filtering out irrelevant content. In a reinforcement stage, learners can generate concept maps that visually organize information, supporting deeper understanding through generative processing.
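To make two of these principles more concrete for myself, here is a minimal Python sketch of what segmenting and signaling could look like on a transcript. This is my own illustration under stated assumptions, not the authors' implementation: the `Utterance` type, the `segment` and `signal` functions, the 60-second default, and the hand-written list of key terms are all hypothetical. In the system described in the article, the transcript would come from speech recognition and the key concepts from the language model.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    start: float  # seconds from the start of the video (assumed ASR output)
    text: str


def segment(utterances, max_seconds=60.0):
    """Segmenting: group consecutive utterances into chunks.

    A new chunk starts once an utterance begins max_seconds or more
    after the first utterance of the current chunk.
    """
    chunks, current, chunk_start = [], [], None
    for u in utterances:
        if chunk_start is None:
            chunk_start = u.start
        if current and u.start - chunk_start >= max_seconds:
            chunks.append(current)
            current, chunk_start = [], u.start
        current.append(u)
    if current:
        chunks.append(current)
    return chunks


def signal(text, key_terms):
    """Signaling: wrap each key term in ** ** so it stands out."""
    if not key_terms:
        return text
    # Longest terms first, in a single pass, so overlapping terms
    # (e.g. "working memory" and "memory") are not marked twice.
    ordered = sorted(key_terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", text)


if __name__ == "__main__":
    transcript = [
        Utterance(0.0, "Working memory has limited capacity."),
        Utterance(30.0, "Poor design adds extraneous load."),
        Utterance(70.0, "Segmenting lets learners control pacing."),
    ]
    for chunk in segment(transcript):
        for u in chunk:
            print(signal(u.text, ["working memory", "extraneous load"]))
        print("---")
```

Even in this toy version, the design choice is visible: the code does not add new content to the video, it only changes how existing content is chunked and emphasized, which is the point of both principles.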

What I found especially interesting is that this study suggests AI does not necessarily conflict with traditional learning theories. Instead, AI can help operationalize these principles more flexibly and at scale. However, the evaluation relies heavily on expert feedback rather than learner data, which raises questions about how students actually experience cognitive load and learning in practice.

### Article 2: Multimodality of AI and Changing Learning Interactions

The second article takes a broader perspective by examining how AI systems are becoming increasingly multimodal, combining text, audio, images, and interaction into unified learning experiences. Rather than focusing on a single instructional tool, this article explores how AI systems themselves are evolving into interactive learning partners that can respond, adapt, and generate content across modalities.

From this perspective, learning with multimedia is no longer a one-way process. Learners interact with AI through conversation, follow-up questions, and iterative refinement. This challenges some traditional assumptions of both Multimedia Learning Theory and Cognitive Load Theory. For example, when learners rely on AI to summarize or explain content, some cognitive effort may be offloaded to the system. This makes it harder to clearly define what counts as extraneous versus germane cognitive load.

At the same time, multimodal AI systems may support learning by allowing students to control pacing, modality, and depth, which aligns with principles like learner control and personalization. The article raises important questions about whether AI changes how learning happens or simply changes how learning environments are structured.

## Takeaways
For researchers, these articles highlight the importance of grounding AI-based learning tools in established learning theories. While AI introduces new interaction patterns, theories like Multimedia Learning Theory and Cognitive Load Theory still provide valuable guidance. At the same time, researchers may need to refine these theories to better account for dynamic, conversational, and adaptive learning environments.

For teachers, the key takeaway is that AI can support effective multimedia learning when it is used intentionally. AI tools can help reduce unnecessary cognitive load by summarizing content, highlighting key ideas, and offering visual representations. However, teachers still play a critical role in deciding when struggle and effort are productive for learning and when support is helpful.

For students, AI-powered multimedia tools offer more personalized and flexible ways to learn, but they also require self-awareness and discipline. Having instant explanations and regenerated answers can support understanding, but it can also lead to surface-level learning if students rely too heavily on the system.

Overall, these readings suggest that AI does not replace how people learn, but it does reshape the learning environment. As AI continues to evolve, learning sciences research will be essential in ensuring that multimedia learning remains meaningful, effective, and cognitively supportive rather than overwhelming or passive.

**References**:
> Lee, G., Shi, L., Latif, E., Gao, Y., Bewersdorff, A., Nyaaba, M., ... & Zhai, X. (2025). Multimodality of AI for education: Towards artificial general intelligence. IEEE Transactions on Learning Technologies.

> AlShaikh, R., Al-Malki, N., & Almasre, M. (2024). The implementation of the cognitive theory of multimedia learning in the design and evaluation of an AI educational video assistant utilizing large language models. Heliyon, 10(3).