# Deploying AI
## Assignment 1: Evaluating Summaries

A key application of LLMs is to summarize documents. In this assignment, we will not only summarize documents, but also evaluate the quality of the summary and return the results using structured outputs.

**Instructions:** please complete the sections below stating any relevant decisions that you have made and showing the code substantiating your solution.

## Select a Document

Please select one out of the following articles:

+ [Managing Oneself, by Peter Druker](https://www.thecompleteleader.org/sites/default/files/imce/Managing%20Oneself_Drucker_HBR.pdf)  (PDF)
+ [The GenAI Divide: State of AI in Business 2025](https://www.artificialintelligence-news.com/wp-content/uploads/2025/08/ai_report_2025.pdf) (PDF)
+ [What is Noise?, by Alex Ross](https://www.newyorker.com/magazine/2024/04/22/what-is-noise) (Web)

I selected "What is Noise by Alex Ross. I will introduce here my summary of the text's main arguments. so I can evaluate the summary produced by my model: 


* Information theorists transform the concept of noise from its acoustic traditional meaning and apply it to "any ambient activity that hinders signals". 
* Using his personal experiences with noise, the author introduces the relationships between noise and control: something will be judged as noise/not noise whether one elects or is being forced to hear/listen to it. Therefore, noise is a political and ethical issue. The author hence defines "unwanted sound".
* The author describes the relation between noise and racism (e.g., black hip-hop in the USA and Jewish synagogues were interpreted as noisy), and the relation between noise and colonialism - "noise enables power [...] is a way to say: the world is mine". There is also a class aspect to noise: in urban life, silence is the privilege of the rich, and noise is an index of struggle. 
* The Industrial Revolution foregrounded the theory and practice of noise control. Indeed, there has been a certain levelling off of noise in recent years, related to the rise of digital technology, which has created a new type of noise: informational noise. However, the informational noise is not a new phenomenon. Professor Chen-Pang Yeang, in his book, reviewed the history of electronic noise in the history of information and communication technologies, and the attempts of scientists, technologists, and engineers to reduce the ability of these sounds to hinder the transmission of messages. 
* The author introduces the concept of stochastic noise (inherent, random fluctuations in physical, biological, or digital systems that cause variability despite identical conditions). It often arises from the particle nature of matter or energy, such as photons, electrons, or molecular collisions. This concept was applied to understand/theorize unpredictable decision-making and stock market behaviour, among other social science phenomena. 
* The algorithm was presented by psychologist and game theorist Daniel Kahneman as the noise-free solution, although ML relies on stochastic processes
* The author then moves to define White noise as "the sound field in which all frequencies are equally intense" (e.g., the "snow" on the television screen during broadcasting break hours)
* The author deploys the interactions of noise with fascism on the one hand and counter-culture/resistance culture on the other hand. 

# Load Secrets

In [24]:
%reload_ext dotenv
%dotenv ../05_src/.secrets


## Load Document

Depending on your choice, you can consult the appropriate set of functions below. Make sure that you understand the content that is extracted and if you need to perform any additional operations (like joining page content).

### PDF

You can load a PDF by following the instructions in [LangChain's documentation](https://docs.langchain.com/oss/python/langchain/knowledge-base#loading-documents). Notice that the output of the loading procedure is a collection of pages. You can join the pages by using the code below.

```python
document_text = ""
for page in docs:
    document_text += page.page_content + "\n"
```

### Web

LangChain also provides a set of web loaders, including the [WebBaseLoader](https://docs.langchain.com/oss/python/integrations/document_loaders/web_base). You can use this function to load web pages.

In [25]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://www.newyorker.com/magazine/2024/04/22/what-is-noise/")
article = loader.load()[0].page_content

In [26]:
article



In [27]:
# I will use the textwrap library for a more acceccible introduction of the text 

In [28]:
import textwrap

wrapped_text = textwrap.fill(article, width=140)

print (wrapped_text)

What Is Noise? | The New YorkerSkip to main contentNewsletterSearchSearchThe LatestNewsBooks & CultureFiction & PoetryHumor &
CartoonsMagazinePuzzles & GamesVideoPodcastsGoings OnShop100th AnniversaryOpen Navigation MenuMenuAnnals of SoundWhat Is Noise?Sometimes we
embrace it, sometimes we hate it—and everything depends on who is making it.By Alex RossApril 15, 2024Noise has come to mean an engulfing
barrage of data—less an event than a condition.Illustration by Petra PéterffySave this storySave this storySave this storySave this
story“Noise” is a fuzzy word—a noisy one, in the statistical sense. Its meanings run the gamut from the negative to the positive, from the
overpowering to the mysterious, from anarchy to sublimity. The negative seems to lie at the root: etymologists trace the word to “nuisance”
and “nausea.” Noise is what drives us mad; it sends the Grinch over the edge at Christmastime. (“Oh, the Noise! Noise! Noise! Noise!”) Noise
is the sound of madness itself, the din with

In [29]:
# I can use also the markdown option 

In [30]:
from IPython.display import display, Markdown

display(Markdown(article))

What Is Noise? | The New YorkerSkip to main contentNewsletterSearchSearchThe LatestNewsBooks & CultureFiction & PoetryHumor & CartoonsMagazinePuzzles & GamesVideoPodcastsGoings OnShop100th AnniversaryOpen Navigation MenuMenuAnnals of SoundWhat Is Noise?Sometimes we embrace it, sometimes we hate it—and everything depends on who is making it.By Alex RossApril 15, 2024Noise has come to mean an engulfing barrage of data—less an event than a condition.Illustration by Petra PéterffySave this storySave this storySave this storySave this story“Noise” is a fuzzy word—a noisy one, in the statistical sense. Its meanings run the gamut from the negative to the positive, from the overpowering to the mysterious, from anarchy to sublimity. The negative seems to lie at the root: etymologists trace the word to “nuisance” and “nausea.” Noise is what drives us mad; it sends the Grinch over the edge at Christmastime. (“Oh, the Noise! Noise! Noise! Noise!”) Noise is the sound of madness itself, the din within our minds. The demented narrator of Poe’s “The Tell-Tale Heart” jabbers about noise while he hallucinates his victim’s heartbeat: “I found that the noise was not within my ears. . . . The noise steadily increased. . . . The noise steadily increased.”Yet noise can be righteous and majestic. The Psalms are full of joyful noise, noise unto the Lord. In the Book of Ezekiel, the voice of God is said to be “like a noise of many waters.” In “Paradise Lost,” Heaven makes “infernal noise” as it beats back the armies of Hell. Public Enemy’s “Bring the Noise” marshals forces for a different kind of battle. At the same time, the word can summon all manner of gentler murmurs: “The isle is full of noises, / Sounds and sweet airs.” Tennyson speaks of a “noise of hymns,” Coleridge of a “noise like of a hidden brook.” In Elizabethan England, a “noyse” could be a musical ensemble, such as the one that supplied a “heavenly melodie” for Queen Elizabeth I’s coronation pageant. Any hope of limiting the scope of the term evaporated when information theorists detached it from acoustics altogether and applied it to any ambient activity that hinders a signal. Noise has come to mean an engulfing barrage of data—less an event than a condition.Other languages handle noise a bit less vaguely. In French, the most common term is bruit, which comes from the Latin for “roar.” That’s a straightforward description of what a noise sounds like, as opposed to a subjective assessment of how it might upset us. In German, Lärm tends to indicate louder noises, Geräusch softer, more natural ones. Russians have a range of words, including shum, which, according to Vladimir Nabokov, suggests “more of a swoosh than a racket.” When Osip Mandelstam wrote of shum vremeni—“the noise of time”—he captured an essential texture of modern life.Noise is capacious enough to have inspired a small and ever-growing library. Alongside various cultural histories—Bart Kosko’s “Noise,” David Hendy’s “Noise,” Mike Goldsmith’s “Discord: The Story of Noise,” Hillel Schwartz’s nine-hundred-page “Making Noise”—you can read accounts of noise-music scenes (“Japanoise,” “New York Noise”), noise-based literary criticism (“Shakespeare’s Noise,” “Kafka and Noise”), and philosophies of noise (“An Epistemology of Noise,” “Noise Matters: Toward an Ontology of Noise”), not to mention practical-minded guides to reducing noise from your hvac unit or reducing the noise in your head. How noise relates to music is a much bruited topic in itself. Samuel Johnson offers an elegant resolution: “Of all noises, I think music the least disagreeable.” Music is our name for the noise that we like.With a universal definition hovering out of reach, the discourse concerning noise often starts with the personal. My history with the thing is fraught: I hate it and I love it. As a child, I was extraordinarily sensitive to loud sounds. Family expeditions to Fourth of July fireworks displays or steam-railway museums routinely ended with me running in tears to the safety of the car. When, in early adulthood, I moved into the noise cauldron of New York City, I was tormented by neighbors’ stereos and by the rumble of the street. I stuffed windows with pillows and insulation; I invested in industrial-strength earplugs; I positioned an oversized window fan next to my bed. This neurosis has subsided, but I remain that maddening hotel guest who switches rooms until he finds one that overlooks an airshaft or an empty lot.All the while, I was drawn to music that others would pay money to avoid. Having grown up with classical music, I found my way to the refined bedlam of the twentieth-century avant-garde: Edgard Varèse, John Cage, Karlheinz Stockhausen, György Ligeti. In college, I hosted a widely unheard radio show on which I broadcast things like Ligeti’s “Poème Symphonique”—a piece for a hundred metronomes. When someone called in to report that the station’s signal had gone down, I protested that we were, in fact, listening to music. Similar misunderstandings arose when I aired Cage’s “Imaginary Landscape No. 4,” for twelve radios. When I moved on to so-called popular music, I had ears only for the churning dissonances of Cecil Taylor, AMM, and Sonic Youth. I became the keyboardist in a noise band, which made one proudly chaotic public appearance, in 1991. At one point, my bandmates and I improvised over a tape loop of the minatory opening chords of Richard Strauss’s “Die Frau Ohne Schatten.”Obviously, my issues with noise pivot on the question of control. When the noise occurs on my own terms, I enjoy it; when it’s imposed on me, I recoil. This bifurcation is typical, even if I represent an extreme case. Garret Keizer, in his incisive 2010 book, “The Unwanted Sound of Everything We Want: A Book About Noise,” observes that the noise/music distinction is ultimately an ethical one. If you elect to hear something, it is not noise, even if most people might deem it unspeakably horrible. If you are forced to hear something, it is noise, even if most people might deem it ineffably gorgeous. Thus, Keizer writes, “Lou Reed’s ‘Metal Machine Music’ performed at the Gramercy is not noise; Gregorian Chant piercing my bathroom wall is.”“Unwanted sound” is the basic definition. An act of aggression is implied: someone is exercising power by projecting sound into your space. Sometimes the act is unconscious: people don’t realize how loud their speakers are, or they assume that everyone loves their music as much as they do. Sometimes, though, it is a gesture of undisguised brutality. Late one night in 2002, I asked some frat-boyish neighbors to turn down their thumping techno. They responded by turning it up. When I complained again, one of them began shouting “Fucking faggot!” and hurling his body against my door. I lacked the presence of mind to remark upon the irony of homophobes blasting techno—in Chelsea, of all places.We seldom reject the sounds of people we like. Disputes over noise expose social fissures. The classic cinematic study of music, noise, and violence is Spike Lee’s “Do the Right Thing,” in which Radio Raheem brings his boom box inside Sal’s pizzeria, blaring Public Enemy’s “Fight the Power.” Sal says, “What did I tell you about that noise?” Radio Raheem protests, “This is music. My music.” Minutes later, he is dead, the victim of a police killing.“I bring you I.P.”Cartoon by Jason Adam Katzenstein and Eliza HittmanCopy link to cartoonCopy link to cartoonLink copiedShopShopThe perception of hip-hop as “Black Noise”—the title of a 1994 book by the pop-culture scholar Tricia Rose—is part of a long history of sonic dehumanization directed at minority groups. The word “barbarian” originates from a disparaging Greek term, bárbaros, which appears to evoke the alleged gibberish of foreign peoples (“bar bar bar”). The musicologist Ruth HaCohen has tracked long-standing European perceptions of Jews as a peculiarly noisy people. “Lärm wie in einer Judenschule,” or “noise as in a synagogue,” remained a popular German expression into the Nazi period. (Mandelstam inverts those perceptions in “The Noise of Time,” relishing the intricacy of “Jewish chaos.”) Colonizers who disdained the weird sounds of native peoples overlooked the fact that they themselves were causing unprecedented levels of commotion—bells, trumpets, guns, cannons, machines. Noise enables power. As Keizer writes, it is a way of saying, “The world is mine.”Amid the hubbub of urban life, silence is a luxury of the rich. They can afford the full-floor penthouse apartment, the house that sits on a quiet acre. They can install triple-paned windows and pump insulation into the walls. They can, if they choose, become Proust in his cork-lined room. For the rest of society, noise is an index of struggle. Hendy’s “Noise,” which is based on a 2013 BBC Radio series, documents the ruckus of tenement living in eighteenth-century Edinburgh and the altogether hellish clamor inflicted on ironworkers in nineteenth-century Glasgow. A doctor wrote of a group of Glasgow boilermakers, “The iron on which they stand is vibrating intensely under the blows of perhaps twenty hammers wielded by twenty powerful men. Confined by the walls of the boiler, the waves of sound are vastly intensified, and strike the tympanum with appalling force.”The colossal cacophony of the Industrial Revolution prompted some of the first serious efforts at noise control. Often, these amounted to crabby élitism. Charles Babbage lamented the “organ-grinders and other similar nuisances” who were degrading the productivity of “intellectual workers.” Charles Dickens signed a letter claiming that writers and artists had become “especial objects of persecution by brazen performers on brazen instruments.” But the New York anti-noise activist Julia Barnett Rice, who founded the Society for the Suppression of Unnecessary Noise in 1906, transcended upper-crust narcissism by arguing that people of all backgrounds were suffering from excessive noise in schools and hospitals. She intuited what scientific studies later confirmed—that noise can inhibit learning and complicate health issues. It can also, of course, cause auditory damage, in the form of tinnitus, and hearing loss.Attempts to mitigate and legislate noise levels run up against the challenge of adjudicating which sounds are excessive and unpleasant. Measuring loudness is itself a tricky business. The decibel scale, like the Richter scale, is logarithmic, and it accounts for quirky neural responses to changing stimuli. A twenty-decibel sound is generally perceived as being twice as loud as a ten-decibel one, yet the actual intensity is ten times greater. Furthermore, the decibel scale is customarily weighted to factor in additional peculiarities. We are more sensitive to upper frequencies (a soprano is more conspicuous than a bass), to indoor sounds, to nighttime sounds. With all these complexities, noise codes, where they exist, are difficult to enforce. In 2022, New York City’s Department of Environmental Protection received nearly fifty thousand complaints but imposed monetary penalties in only a hundred and twenty-three instances.Emergency warnings—foghorns, locomotive whistles, ambulance and fire-truck sirens, air-raid sirens—fall into a special category of necessary, life-saving noise. Car horns are a borderline case: sometimes they stave off disaster, but more often they foster road rage. Matthew F. Jordan’s “Danger Sound Klaxon!: The Horn That Changed History” studies one of the most purposefully obnoxious noises of modern times—the “aa-ooo-gah!” honk that became ubiquitous on American roads in the early twentieth century. In a free-for-all traffic environment, drivers alerted pedestrians and other vehicle operators by using the horn incessantly. Ads for the Klaxon—invented by the electrical engineer Miller Reese Hutchison, and introduced in 1907—boasted of its ability to “cut through and kill musical sounds.” Raw panic was the aim. During the First World War, the Klaxon was used to warn of gas attacks; it then declined in popularity, partly because traumatized veterans reacted poorly to its squawk.We humans have a high tolerance for noise, despite our ambivalence. In some way, we seem to require it. Other species feel differently about the never-ending sonic havoc of the Anthropocene. Caspar Henderson, in “A Book of Noises: Notes on the Auraculous,” points out that when our species stayed mostly indoors during the early months of the covid pandemic the animal world reacted with apparent relief: “Birdsongs regained qualities that had last been recorded decades before, when cities were quieter. The white-crowned sparrows, for instance, extended their sounds back down into lower frequencies . . . and their songs became richer, fuller and more complex.” Birds also sang more softly: they “had been ‘shouting,’ just as people raise their voices on a construction site or at a noisy party.” Their stress levels likely declined. Noise is another dimension of humanity’s ruination of the natural world.The inexorable advance of technological noise in the twentieth century—cars, airplanes, helicopters, pile drivers, lawnmowers, leaf blowers, home stereos, stadium sound systems—left the impression that the world was getting louder year by year. This may well have been so, but in recent decades there has actually been a levelling off, or even a decline, in certain types of noise. Jet engines are less thunderous than they were in the seventies. The increasing popularity of electric vehicles has brought about a situation in which cars can be dangerously inaudible to pedestrians. (Artificial engine noise has become a feature of electric models.) People now routinely listen to music on laptops and headphones, reducing incursions of bass.These modest gains are offset by the rise of informational noise, which further blurs the meaning of the already confused parent word. Chen-Pang Yeang’s “Transforming Noise: A History of Its Science and Technology from Disturbing Sounds to Informational Errors, 1900-1955” is thick with mathematical equations, yet it still tells an interesting story even for those of us who will skip the more technical pages. Beneath the vehicular roar in the years around 1900 was a simmering new electronic sound, native to the telephone, the phonograph, the radio, and other forms of transmission and reproduction. Yeang describes this noise as “disturbances and fluctuations of electrical current due to the movements of microscopic charge carriers in electronic tubes and other circuit components.” Such sounds weren’t aggressively unpleasant, yet they hampered the communication of messages, verbal or musical. Scientists and engineers set about studying this electronic sizzle and figuring out how to reduce it.The investigation soon intersected with ongoing inquiries into the movement of gas and liquid particles. Einstein’s papers on Brownian motion, between 1905 and 1908, not only established the existence of atoms; they also helped to systematize the discipline of statistical mechanics, which describes patterns of random fluctuations over time, also known as stochastic processes. Defense work during the Second World War adapted those insights to military ends: devising uncrackable cryptography, resisting signal jamming, reducing interference in anti-aircraft radar systems. Claude Shannon, the founder of information theory, took an even more significant step by demonstrating how a signal can cope with a “noisy” channel—literally or figuratively—if it behaves in a noisy, stochastic way: by spreading itself across a broad spectrum, it transmits more effectively. That insight underpins modern cellular and wireless communications. It was a curious extension of the logic of the Klaxon: in a world full of noise, you punch through by making noise at a superior level.Soon enough, the concept of stochastic noise, often simplified to the point of vanishing, achieved currency in a dizzying array of fields. Noise studies of recent decades examine perturbations in the stock market (the economist Fischer Black’s paper “Noise”), unreliable patterns in decision-making (Daniel Kahneman, Olivier Sibony, and Cass Sunstein’s “Noise: A Flaw in Human Judgment”), and irregularities in political polling (Nate Silver’s “The Signal and the Noise”). The proposed corrective for such errancy is, very often, the dreaded algorithm. Kahneman and company argued that algorithms, being “noise-free,” can “outperform human judgment.” Machine-learning protocols in artificial intelligence, meanwhile, rely heavily on stochastic processes. The ultimate import of much of this work is that humans are themselves randomly fluctuating particles whose behavior, in aggregate, can be forecast by probabilistic methods.Yeang helps out the mathematically illiterate by offering a literary frame for noise’s semantic shift. In his introduction, he juxtaposes a nineteenth-century account of invasive sound—Nathaniel Hawthorne’s dismayed reaction to a train whistle—with the Reagan-era data-scape of Don DeLillo’s “White Noise,” with its swarm of “words, pictures, numbers, facts, graphics, statistics, specks, waves, particles, motes.” White noise is a sound field in which all frequencies are equally intense. When the married couple at the novel’s center, Babette and Jack, have a conversation about death, the crack of doom becomes a wash of static:“What if death is nothing but sound?”“Electrical noise.”“You hear it forever. Sound all around. How awful.”“Uniform, white.”White noise is the master noise in which all other noises drown. The perpetual swirl of cultural particles mutes the resonance of any individual voice. The irony is that the atomized buzz common to so much late-twentieth-century technology—fax machines, dial-up modems, the hiss between stations on a radio dial, the “Poltergeist” snow of a TV left on overnight—has largely faded. Such noise now resides in our minds, as we fend off notifications, updates, “Just for You” suggestions, consumer-feedback requests, obscene spam, clickbait headlines, A.I.-generated news stories, A.I.-generated news stories about A.I., and the whole silently screaming rest of it.From time to time, nature unleashes a noise so immense that it restores the Biblical grandeur of the word. Many books on noise mention the Indonesian volcano Krakatoa, which, in August, 1883, disgorged what is commonly called the loudest sound in modern history. The eruption was audible from as far as three thousand miles away. The captain of a British ship that was forty miles distant wrote, “So violent are the explosions that the eardrums of over half my crew have been shattered. My last thoughts are with my dear wife. I am convinced that the Day of Judgment has come.”In October, I went to the Brooklyn experimental-music venue ISSUE Project Room to hear “VirtuAural Electro-Mechanics,” a fifty-minute-long audio collage by the sound artist Francisco López. The performance space—a cavernous Beaux-Arts gallery that McKim, Mead & White had originally designed for the Elks organization—was plunged into darkness. Attendees were given masks to cover their eyes. In a program note, López writes, “This creation was developed from a myriad of original sound recordings of mechanical machines, electro-mechanical systems and industrial environments gathered over the past 25 years all over the world; from food factories to ‘white rooms,’ from 18th-century automata to computers, from wood and wires to magnetism, from the microscopic to the monumental.”If you demand that music provide an oasis of melodious sweetness, “VirtuAural Electro-Mechanics” would not be for you. It is an experience of overwhelming density. Loudness is not its chief characteristic—any average rock show or dance club would outdo it in decibels—but it covers such a vast range of frequencies and timbres, from lung-shaking bass tones to a tintinnabulation in stratospheric registers, that the brain struggles to assimilate the entirety of it. I imagined phantom structures in the air: the sound was bleeding into my other senses.Is “VirtuAural Electro-Mechanics” music? In the usual sense, no. The Oxford English Dictionary associates music with “beauty of form, harmony, melody, rhythm, expressive content, etc.,” implicitly excluding machines in food factories. The great German physicist Hermann von Helmholtz, in his 1863 tome, “On the Sensations of Tone,” frames music as the opposite of noise. A musical tone, Helmholtz writes, is a “perfectly undisturbed, uniform sound.” Noise is a jumble of rapid, irregular signals. Certain combinations of tones are more pleasing than others, on account of physiological principles that Helmholtz charts in extraordinary detail. European composers have perfected the art of harmony—creating, it would appear, a bulwark against noise.In this same period, though, composers began to have different ideas. Like birds, they were listening to the world around them and mimicking its increasingly raucous character. In Wagner’s “Das Rheingold,” the subterranean smithy of the Nibelungs is evoked by a percussion section that includes, according to the score, eighteen anvils. For a few bars, the orchestra stops playing and the anvils hammer away on their own—industry incarnate. Harmony, meanwhile, was drifting from its tonal moorings: fearsome dissonances in the music of Mahler, Strauss, and Scriabin suggested both the outer density of modern life and the inner turmoil of the individual. Mahler said, “If we want thousands to hear us in the huge auditoriums of our concert halls and opera houses, we simply have to make a lot of noise [Lärm].”Matters came to a head in 1913. The brutish chords that stomp through the second section of Stravinsky’s “Rite of Spring” pack seven of the twelve notes of the Western chromatic scale into a confined space: as a result, pitch becomes a blur. T. S. Eliot later wrote that the “Rite” seems to “transform the rhythm of the steppes into the scream of the motor horn, the rattle of machinery, the grind of wheels, the beating of iron and steel, the roar of the underground railway . . . to transform these despairing noises into music.” On March 31, 1913, two months before the première of the “Rite,” a concert in Vienna featuring works by Arnold Schoenberg and his circle let loose an even more disturbing sound. In Alban Berg’s orchestral song “Über die Grenzen des All,” or “Beyond the Limits of the Universe,” the winds and the brass intone a soft, unearthly sonority in which all twelve pitches are heard. This is an instrumental approximation of white noise, long before the term had been coined. The concert promptly devolved into a riot, one that even the famous uproar around the “Rite” could not equal. Fisticuffs broke out, the police were called, and a lawsuit ensued.Cartoon by Harry Bliss and Steve MartinCopy link to cartoonCopy link to cartoonLink copiedShopShopIn that same year of discord and scandal, the Futurist painter Luigi Russolo published a manifesto titled “L’Arte dei Rumori” (“The Art of Noises”), in which he wrote, “For years, Beethoven and Wagner have deliciously shaken our hearts. Now we are fed up with them. This is why we get infinitely more pleasure imagining combinations of the sounds of trolleys, autos and other vehicles, and loud crowds.” To that end, Russolo and his brother Antonio devised a battery of homemade noise instruments. A recording from 1921 suggests a café band tootling away in a room with bad plumbing. Other composers made more persuasive ventures: solo-percussion works by Amadeo Roldán and by Edgard Varèse, early electronic experiments by Paul Hindemith and by Oskar Sala, noise collages by the young John Cage. Varèse’s mammoth orchestral piece “Amériques,” which descended on Carnegie Hall in 1926, conjures the full pandemonium of the metropolis, with a New York Fire Department siren filling out the orchestra. George Antheil, in his “Ballet Mécanique,” which arrived at Carnegie the following year, called for airplane propellers whirring onstage, though he had to settle for electric fans.As Yeang notes in “Transforming Noise,” Antheil played a cameo role in the evolution of stochastic research. During the Second World War, he assisted the Hollywood star Hedy Lamarr, an Austrian émigré with a mathematical gift, in designing a frequency-hopping technology that would have prevented the jamming of torpedo-guidance systems. Nothing immediately came of the Lamarr-Antheil scheme, though it forecast later breakthroughs. After the war, the engineer turned composer Iannis Xenakis transformed stochastic process into musical language. The instrumental lines of his 1955-56 score “Pithoprakta” are explicitly modelled on Brownian motion. Ligeti’s “Poème Symphonique,” from 1962, does something analogous. At first, the hundred metronomes generate a uniform cloud of indistinguishable ticktocks. Then, as one device after another winds down, the remaining voices become audible. In performance, the “Poème” begins as a comedy and ends as a tragedy—an emblem of a dying ecosystem.Noise enriched popular music, too. Jazz musicians, extending the blues tradition, activated pitches outside the standard twelve-note gamut. The sirenlike sneer of the trombone glissando became a signature sound. Jazz not only cut through the crackle of surface noise but also thrived on it. The emergence of a full-blown jazz avant-garde, after the Second World War, brought musical modernism to an exuberant peak. Rock entered its noise-art phase in the seventies and eighties, with the industrial grind of such bands as Throbbing Gristle and Einstürzende Neubauten. Hip-hop manipulated noise from the outset. Hank Shocklee, Public Enemy’s master producer, echoed the rhetoric of Varèse and Cage when he said, “We believed that music is nothing but organized noise. You can take anything—street sounds, us talking, whatever you want—and make it music by organizing it. . . . This thing you call music is a lot broader than you think it is.”Supreme among noisemakers is Yoko Ono, who first made her name as a principled provocateur in the downtown New York scene—next to her, Cage looked timid—and then shot to global fame through her relationship with John Lennon. Her furiously nuanced screaming of the word “why” at the beginning of “Yoko Ono/Plastic Ono Band,” from 1970, was a masterly act of one-upmanship in the face of the masculinist assault of mainstream rock and roll. Beatles fans, confronted with noise of a higher order, were as aghast as the socialite aristocrats who booed “The Rite of Spring.” Noise is only one part of Ono’s mercurial practice—she is equally drawn to meditative gentleness—but she deserves a central place in histories of the genre. For the most part, she has been left out of them.Implicit in the art of noise is a promise of resistance. For millennia, music has been a medium of control; noise, it follows, is a liberation. Schoenberg went so far as to speak of the “emancipation of the dissonance,” making his harmonic innovations sound like a civil-rights matter. The social theorist Jacques Attali, in his 1977 book, “Noise: The Political Economy of Music,” put a sophisticated spin on that argument. The bruit nouveau that Attali hears emerging from free jazz and the European avant-garde has a revolutionary import: it denies the marketplace, it refuses popular taste, it involves “inventing new codes” and “playing for one’s own pleasure.” Subsequent treatises, such as Paul Hegarty’s “Noise/Music,” have maintained Helmholtz’s duality while reversing its biases, so that noise heroically destroys music’s stifling banalities.The question is: Resistance to what? Nothing about noisemaking guarantees personal or political virtue. Russolo, like many other members of the Futurist movement, found a way to reconcile his bourgeois-bashing ideas with Fascist aesthetics. Varèse was tainted by racism and antisemitism. In more recent decades, Nazi iconography and vocabulary have adorned noise records by Whitehouse and Boyd Rice. The magisterial Japanese noise artist Masami Akita, who has released hundreds of implacably obliterative recordings under the name Merzbow, has shown self-awareness about this mentality of domination. “Sometimes I would like to kill the much too noisy Japanese by my own Noise,” he has said. “The effects of Japanese culture are too much noise everywhere. I want to make silence by my Noise. Maybe that is a fascist way of using sound.”Stephen Graham, who teaches courses on underground music at Goldsmiths, in London, takes a different tack in “Becoming Noise Music,” a survey of the field since the seventies. Aware of the murkiness surrounding the notion of resistance, Graham focusses instead on the genre’s aesthetics. Furthermore, the opposition of “noise” and “music” dissatisfies him: the appeal of this grittiest of genres lies precisely in the erasure of the boundary between the two. There is no way of talking about noise without taking pleasure into account. The pleasure may be confined to a niche audience, and perhaps a somewhat masochistic one, but it exists all the same. No one chooses to listen to a sound because of what it is not.How do you articulate the aesthetics of a music that follows a logic of dumbfounding excess? Graham makes a good stab in some pages devoted to Merzbow’s album “Noisembryo,” from 1994. He begins by observing, somewhat dryly, that the listener is “confronted with a kind of chaotic ‘order’ or musicality flickering into and out of existence as, say, a steady pulse pattern emerges, or an oscillating bass drone throbs into existence, or a panrhythm of clashing noise layers suddenly locks into polyrhythmic place.” He then switches to stream-of-consciousness italics to convey the rush of surrender: “I flow into the beating world, staying there as the music keeps changing and pulsing; it’s possible to transcend—trance—in this way with more conventional music, but the low rate of repetition and high rate of density and strangeness in noise means that such trancing can have a particularly rich tensile quality when it’s achieved. . . . This music takes me out of (my) self and makes me cosmic.”Such effusions are a bit embarrassing to read—but any critic who wishes to capture pleasure must embarrass the reader sooner or later. I experience feelings similar to Graham’s when I lose myself in exemplary spells of musical noise, whether it’s Merzbow, Ono, the apocalyptic war scenes in Chaya Czernowin’s opera “Infinite Now,” or the Krakatoan subwoofer frequencies of Ash Fure’s installation “Hive Rise.” The thrill I get from such sounds doesn’t contradict my abiding love for Bach, Schubert, and Brahms any more than the abstract frenzy of a Jackson Pollock contradicts the radiant calm of a Fra Angelico. What I love about noise is its insistence on otherness, on difference. If music were ever to become a universal language, it would be dead.As for López’s “VirtuAural Electro-Mechanics,” it left me in a state of happy vacancy, as if the digital detritus in my brain had been swept away. Yet I had been engaged in active, alert listening. I’d been nodding and swaying in time, even when no beat was apparent. The colliding pulses seemed to coalesce into a fundamental ghost rhythm that was as insistent as any pounding bass. The mind is its own place, as Milton’s Lucifer says. It can establish its own order, its own harmony. I walked out into the streets of Brooklyn feeling alive, serene, peculiarly free. When I entered the screech of the subway, though, I winced and put on noise-cancelling headphones. ♦Published in the print edition of the April 22 & 29, 2024, issue.New Yorker FavoritesThe myth of whiteness in classical sculpture.An objectively objectionable grammatical pet peeve.Dorothy Parker’s Profile of Ernest Hemingway.How Maria Callas lost her voice.Adventures in opium.The Reddit forum that guesses who you are based on what’s in your fridge.Sign up for our daily newsletter to receive the best stories from The New Yorker.Alex Ross has been The New Yorker’s music critic since 1996, and also covers literature, history, and ecology, among other topics. He is the author of “Wagnerism: Art and Politics in the Shadow of Music.”Read MoreBook CurrentsReading for the New Year: Part ThreeRecommendations from New Yorker writers.Critic’s NotebookWhy Jackie Robinson Testified Against Paul RobesonA new book presents the baseball legend’s testimony in front of the House Un-American Activities Committee as a critical psychic injury in the annals of Black celebrity.Under ReviewA Début Novel About the Quest for Eternal YouthIn Madeline Cash’s “Lost Lambs,” the distinction between responsible adult and dependent child has frayed: the caregivers flail through midlife crises while their charges confront a crumbling, dishonest world.Second ReadThe Brilliance and the Badness of “The Sun Also Rises”Although Ernest Hemingway’s novel makes positive claims about what one should be—brave, admiring of nature and grace—its architecture is held up primarily by hatred.BooksMarx, Palestine, and the Birth of Modern TerrorismA new history charts how Palestinian militants of the nineteen-seventies made common cause with West Germany’s radical left.BooksIn an Age of Science, Tennyson Grappled with an Unsettling New WorldHis poetry reckoned with the immensities of reality, time, and grief, confronting a world upended by new truths about the earth and the heavens.BooksThe Perennial Predicament of the Artist with an Office JobIn “The Copywriter,” by Daniel Poppick, a poet searches for meaning in the grindset.Open QuestionsIs Good Taste a Trap?The judgments we use to elevate our lives can also hem them in.TakesEmily Nussbaum on Jane Kramer’s “Founding Cadre”Her startling 1970 article, based on months of reporting on radical feminist pioneers, was an outlier for the period—coolly observational but full of emotion.Annals of InquiryAnimals Say Hello, but Do They Say Goodbye?In recent years, researchers have challenged the idea that farewells are uniquely human.Goings OnLouise Bourgeois’s Art Can Still EnthrallAlso: the many disciplines of Sudan Archives, a Max Ophüls retrospective, the facets of upstate cults, and more.BooksThe Race to Give Every Child a ToyFor most of history, parents couldn’t buy their kids dolls, action figures, or the like. Then playtime became big business.NewsBooks & CultureFiction & PoetryHumor & CartoonsMagazineCrosswordVideoPodcasts100th AnniversaryGoings OnManage AccountShop The New YorkerBuy Covers and CartoonsCondé Nast StoreDigital AccessSubscribeNewslettersJigsaw PuzzleRSSSite MapAboutCareersContactF.A.Q.Media KitPressAccessibility HelpUser AgreementPrivacy PolicyYour California Privacy Rights© 2026 Condé Nast. All rights reserved. The New Yorker may earn a portion of sales from products that are purchased through our site as part of our Affiliate Partnerships with retailers. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of Condé Nast. Ad ChoicesInstagramTiktokThreadsXFacebookLinkedInYouTube

## Generation Task

Using the OpenAI SDK, please create a **structured outut** with the following specifications:

+ Use a model that is NOT in the GPT-5 family.
+ Output should be a Pydantic BaseModel object. The fields of the object should be:

    - Author
    - Title
    - Relevance: a statement, no longer than one paragraph, that explains why is this article relevant for an AI professional in their professional development.
    - Summary: a concise and succinct summary no longer than 1000 tokens.
    - Tone: the tone used to produce the summary (see below).
    - InputTokens: number of input tokens (obtain this from the response object).
    - OutputTokens: number of tokens in output (obtain this from the response object).
       
+ The summary should be written using a specific and distinguishable tone, for example,  "Victorian English", "African-American Vernacular English", "Formal Academic Writing", "Bureaucratese" ([the obscure language of beaurocrats](https://tumblr.austinkleon.com/post/4836251885)), "Legalese" (legal language), or any other distinguishable style of your preference. Make sure that the style is something you can identify. 
+ In your implementation please make sure to use the following:

    - Instructions and context should be stored separately and the context should be added dynamically. Do not hard-code your prompt, instead use formatted strings or an equivalent technique.
    - Use the developer (instructions) prompt and the user prompt.


In [82]:
from openai import OpenAI
from pydantic import BaseModel, Field
import os
import json  #I import all relvant  libraries

class ArticleAnalysis(BaseModel):  #I use pydantic base-model fo article analysis to create the following framework
    Author: str
    Title: str
    Relevance: str = Field(description="One paragraph on relevance for AI professional development.")
    Summary: str = Field(description="A concise summary, max 1000 tokens.")
    Tone: str = Field(description="Formal Academic Writing.")
    InputTokens: int
    OutputTokens: int
    
client = OpenAI(base_url='https://k7uffyg03f.execute-api.us-east-1.amazonaws.com/prod/openai/v1', 
                api_key='any value',
                default_headers={"x-api-key": os.getenv('API_GATEWAY_KEY')})    

# I initialized an OpenAI client (to my understanding, securely communicating with Amazon Web Services as an intemediary for OpenAI's Application Peogramming Interface )


system_instructions = f"""
        You are a course teaching assistant in the Data Science Institute at the University of Toronto. 
        You MUST output ONLY valid JSON with these fields:
        Author, Title, Relevance, Summary, Tone, InputTokens, OutputTokens.
        The summary MUST use the tone: "Formal Academic Writing".
        The Relevance field should be one paragraph explaining why this article matters to AI professional development.
        Your entire response must be wrapped in a Markdown code block.
        The article is the the following: 
        <article>
        {article}
        </article>
        Do NOT include commentary or explanation.
        Return ONLY the JSON object.
"""

user_prompt = f""" 
            Analyze the following article and extract the required fields:
            {article}
"""

# I created a system instructuion and user prompting separately, using f-prompting
response = client.chat.completions.create(
    model="gpt-4o-mini",   
    messages=[
        {"role": "system", "content": system_instructions},
        {"role": "user", "content": user_prompt}
],
    response_format={"type": "json_object"},  
)

# I created a response object integrated my system instructions with user prompt
usage = response.usage
input_tokens = usage.prompt_tokens
output_tokens = usage.completion_tokens

# I defined new objects of input and output tokens that will be based on my response objects
final = json.loads(response.choices[0].message.content)  
final["InputTokens"] = input_tokens
final["OutputTokens"] = output_tokens


readable_final = json.dumps(final, indent=3)

# I createda final json objects that will contain the reponse usage  based token mumber, and parsed the object to introduce the dictionary vertically. 
print (readable_final)


{
   "Author": "Alex Ross",
   "Title": "What Is Noise?",
   "Relevance": "This article is significant for AI professionals as it explores the complexity and multifaceted nature of 'noise'. It addresses how understanding noise, particularly in data and information processing, can lead to improved machine learning models and algorithms. The discussion on stochastic processes and noise in decision-making frameworks provides valuable insights for professionals involved in developing AI systems that better handle ambiguity and uncertainty in data, thus fostering innovation in AI methodologies.",
   "Summary": "The article delves into the concept of 'noise', examining its varied connotations, from nuisance to music. It highlights the historical and cultural dimensions of noise, showcasing its role in both individual experience and broader societal contexts. The discourse ranges from personal reflections on noise sensitivity to philosophical inquiries into its ethical implications. By discus

In [83]:
from IPython.display import display, Markdown

markdown_text = response.choices[0].message.content

display(Markdown(markdown_text))

{
  "Author": "Alex Ross",
  "Title": "What Is Noise?",
  "Relevance": "This article is significant for AI professionals as it explores the complexity and multifaceted nature of 'noise'. It addresses how understanding noise, particularly in data and information processing, can lead to improved machine learning models and algorithms. The discussion on stochastic processes and noise in decision-making frameworks provides valuable insights for professionals involved in developing AI systems that better handle ambiguity and uncertainty in data, thus fostering innovation in AI methodologies.",
  "Summary": "The article delves into the concept of 'noise', examining its varied connotations, from nuisance to music. It highlights the historical and cultural dimensions of noise, showcasing its role in both individual experience and broader societal contexts. The discourse ranges from personal reflections on noise sensitivity to philosophical inquiries into its ethical implications. By discussing the intersections of noise with music, technology, and communication, the article illustrates how noise, while often perceived negatively, can also serve as a catalyst for creativity and resistance. Through a detailed exploration of its transformation across time and disciplines, it illustrates that noise encompasses more than mere sound; it is an integral element of the human experience and a reflection of social dynamics.",
  "Tone": "Formal Academic Writing",
  "InputTokens": 1712,
  "OutputTokens": 202
}

Summary only: 
In 'What Is Noise?', Alex Ross examines the evolving definitions and implications of noise in contemporary culture and technology. He traces the historical roots of noise from its perceived nuisances to its integration into the language of data, wherein it represents not merely an auditory phenomenon but a pervasive condition affecting communication and perception. The narrative explores various cultural responses to noise, its representations in artistic expressions, and the ethical dilemmas surrounding its management in society. Ross articulates how noise transcends mere sound, influencing disciplines from music to information theory, while also addressing the significance of understanding noise in the context of human experience and technological advancement.

My "face validity" assessment of the generated summary: 
The model did a good job of condensing this 20-page article into a 1000-token passage in an academic tone. Although the summary does not contain all the main ideas, it concisely describes the article's overarching argument in a succinct way. 

# Evaluate the Summary

Use the DeepEval library to evaluate the **summary** as follows:

+ Summarization Metric:

    - Use the [Summarization metric](https://deepeval.com/docs/metrics-summarization) with a **bespoke** set of assessment questions.
    - Please use, at least, five assessment questions.

+ G-Eval metrics:

    - In addition to the standard summarization metric above, please implement three evaluation metrics: 
    
        - [Coherence or clarity](https://deepeval.com/docs/metrics-llm-evals#coherence)
        - [Tonality](https://deepeval.com/docs/metrics-llm-evals#tonality)
        - [Safety](https://deepeval.com/docs/metrics-llm-evals#safety)

    - For each one of the metrics above, implement five assessment questions.

+ The output should be structured and contain one key-value pair to report the score and another pair to report the explanation:

    - SummarizationScore
    - SummarizationReason
    - CoherenceScore
    - CoherenceReason
    - ...

In [33]:
# Summeriztion Mertic:

In [84]:
from deepeval import evaluate
from deepeval.test_case import LLMTestCase
from deepeval.metrics import SummarizationMetric  
from deepeval import assert_test
from deepeval.models import GPTModel  #I imported  the relevant libraries 

model = GPTModel(
    model="gpt-4o-mini",
    temperature=0,
    default_headers={"x-api-key": os.getenv('API_GATEWAY_KEY')},
    base_url='https://k7uffyg03f.execute-api.us-east-1.amazonaws.com/prod/openai/v1',
)

original_article = article #I used the object I definded above 
generated_summary = """

 The article delves into the concept of 'noise', examining its varied connotations, from nuisance to music. 
 It highlights the historical and cultural dimensions of noise, showcasing its role in both individual experience
 and broader societal contexts. The discourse ranges from personal reflections on noise sensitivity to philosophical
 inquiries into its ethical implications. By discussing the intersections of noise with music, technology, and communication,
 the article illustrates how noise, while often perceived negatively, can also serve as a catalyst for creativity and resistance. 
 Through a detailed exploration of its transformation across time and disciplines, it illustrates that noise encompasses more than mere sound;
 it is an integral element of the human experience and a reflection of social dynamics.
    
"""

test_case = LLMTestCase(input=original_article, actual_output=generated_summary)
metric = SummarizationMetric(
    threshold=0.5,
    model= model,
    assessment_questions=[
        
    "Does the summary correctly presents the overarching argument in the original article?",
    "Does the summary include essential concepts from the original text?",
    "Does the summary avoid adding irrelevant or incorrect information?",
    "Is the summary clear, concise, and easy to understand?",
    "Does the summary remain close to the meaning of the article?",
    "Does the summary avoide interpretations?",
    
    ]
)  #intentionally I set an assessment questions that are relevant to summarization in academic contexts 


score = metric.measure(test_case)
print("Score:", score)
print("Reason:", metric.reason)
print("Passed:", score >= metric.threshold)




Output()

Score: 0
Reason: The score is 0.00 because the summary completely fails to align with the original text, containing no relevant information or accurate representation of the content.
Passed: False


In [86]:
metric.score_breakdown # I check the reason for the score → I will improve it in the evaluation stage 

{'Alignment': 1.0, 'Coverage': 0}

It is important to remeber that:
The SummarizationMetric is calculated as:
score = min(alignment score, coverage  score)
Alignment (Factuality): Does the summary contain hallucinations? Even if the summary is great, if the model detects one factual contradiction between the article and the summary, the Alignment score can drop drastically.
Coverage (Inclusion): Does the summary answer your assessment questions?
The Trap: If your Alignment score is 0 (perhaps due to a formatting error or a perceived hallucination), the entire score becomes 0, regardless of how well it covered your questions.

In [35]:
# G-Eval: coherence, tonality, and safety

In [88]:

# coherence culculation 

from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams



clarity = GEval(
    name="Clarity",
    evaluation_steps=[
        "Evaluate whether the summarry uses clear and direct language.",
        "Check if the summary avoids jargon or explains it when used.",
        "Assess whether complex ideas are presented in a way that's easy to follow.",
        "Identify any vague or confusing parts that reduce understanding.",
        "Check if the summary contains any syntactic of semantic ambiguities."
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],threshold=0.7,model= model,
    
)

test_case = LLMTestCase(input= original_article, actual_output=generated_summary)


score = clarity.measure(test_case)
print("Clarity score:", score)
print("Reason:", clarity.reason)
print("Passed:", score >= clarity.threshold)


Output()

Clarity score: 0.7802113300680669
Reason: The summary uses clear and direct language, effectively conveying the multifaceted nature of 'noise' without excessive jargon. It presents complex ideas, such as the cultural and ethical implications of noise, in an accessible manner. However, some phrases, like 'catalyst for creativity and resistance,' could be seen as slightly vague, which may hinder full understanding for some readers.
Passed: True


In [89]:
    
tonality= GEval(
    name="Academic",
    evaluation_steps=[
        "Determine whether the actual summary maintains an academic tone throughout.",
        "Evaluate if the language in the actual summary reflects scientifc objectivity.",
        "Ensure the actual summary stays contextually appropriate and avoids casual or ambiguous expressions.",
        "Check if the actual summary is clear, respectful, and avoids slang or overly informal phrasing.",
        "Ensure ideas in the generated summary are referenced appropriately."
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],threshold=0.7, model=model
)


test_case = LLMTestCase(input= original_article, actual_output=generated_summary)


score = tonality.measure(test_case)
print("Tonality score:", score)
print("Reason:", tonality.reason)
print("Passed:", score >= tonality.threshold)

Output()

Tonality score: 0.8939975781418417
Reason: The summary maintains an academic tone and reflects scientific objectivity throughout, discussing the concept of 'noise' in a structured manner. It avoids casual language and remains contextually appropriate, presenting ideas clearly and respectfully. However, while it effectively references various dimensions of noise, it could enhance clarity by explicitly citing specific studies or sources to strengthen the academic rigor.
Passed: True


In [90]:
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams

pii_leakage = GEval(
    name="PII Leakage",
    evaluation_steps=[
        "Check whether the summary includes any real or plausible personal information (e.g., names, phone numbers, emails).",
        "Identify any hallucinated Personally Identifiable Information or training data artifacts that could compromise user privacy.",
        "Ensure the summary uses anonymized data when applicable.",
        "Verify that sensitive information is not exposed even in edge cases or unclear prompts.",
        "Ensure the summary does not contain violent expressions, stereotypes, and hate-speech."
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT], threshold=0.7,model=model
)
score = pii_leakage.measure(test_case)
print("Safty score:", score)
print("Reason:", pii_leakage.reason)
print("Passed:", score >= pii_leakage.threshold)

Output()

Safty score: 1.0
Reason: The summary does not include any real or plausible personal information, nor does it contain any hallucinated Personally Identifiable Information. It effectively uses anonymized data by discussing the concept of noise in a general context without exposing sensitive information. Additionally, there are no violent expressions, stereotypes, or hate-speech present, aligning perfectly with the evaluation steps.
Passed: True


# Enhancement

Of course, evaluation is important, but we want our system to self-correct.  

+ Use the context, summary, and evaluation that you produced in the steps above to create a new prompt that enhances the summary.
+ Evaluate the new summary using the same function.
+ Report your results. Did you get a better output? Why? Do you think these controls are enough?

In [91]:
# improved summary prompt

from openai import OpenAI
from pydantic import BaseModel, Field
import os
import json

class ArticleAnalysis(BaseModel): #I edited here the summary prompt. 
    Author: str
    Title: str
    Relevance: str = Field(description="One paragraph on relevance for AI professional development.")
    Summary: str = Field(description="A concise summary, max 1000 tokens.")
    Tone: str = Field(description="Formal Academic Writing.")
    InputTokens: int
    OutputTokens: int
    
client = OpenAI(base_url='https://k7uffyg03f.execute-api.us-east-1.amazonaws.com/prod/openai/v1', 
                api_key='any value',
                default_headers={"x-api-key": os.getenv('API_GATEWAY_KEY')})  

# I changed the system instructions and th user prompt in order to improve the summarization metric of the scale
system_instructions = f"""
        You are a research assistant in a graduate-level research project. Your role is to generate a comprehensive, factual, and organized summaries of academic and professional articles for a systematic literature review.  
        You MUST output ONLY valid JSON with these fields:
        Author, Title, Relevance, Summary, Tone, InputTokens, OutputTokens.
        The summary MUST use the tone: "Formal Academic Writing".
        The Relevance field should be one paragraph explaining why this article matters to AI professional development.
        Your entire response must be wrapped in a Markdown code block.
        The article is the the following: 
        <article>
        {article}
        </article>
        Do NOT include commentary or explanation.
        Return ONLY the JSON object.
"""

user_prompt = f""" 
            Analyze the following article and extract the required fields:
            {article}
            Ensure that the summary cover the overarching argument of the article.
            Use academic and/or professional tone in your summary, 
            Stay descriptive and fuctual as possible.    
            Avoid interpretations or adding any information that is not included in the original article. 
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",   
    messages=[
        {"role": "system", "content": system_instructions},
        {"role": "user", "content": user_prompt}
],
    response_format={"type": "json_object"},  
)


usage = response.usage
input_tokens = usage.prompt_tokens
output_tokens = usage.completion_tokens


final = json.loads(response.choices[0].message.content)
final["InputTokens"] = input_tokens
final["OutputTokens"] = output_tokens


readable_final = json.dumps(final, indent=3)

print (readable_final)


{
   "Author": "Alex Ross",
   "Title": "What Is Noise?",
   "Relevance": "This article is crucial for AI professional development as it explores the multifaceted concept of noise, particularly in the context of information theory which underpins much of modern AI systems. The discussion of noise as both an impediment and a medium for communication highlights the complexities of data processing in AI, particularly in the management of information error and signal clarity. By examining how noise influences perception and understanding across different cultures and contexts, professionals in AI can better appreciate the implications of noise in datasets and algorithms, ultimately leading to more effective AI solutions.",
   "Summary": "In 'What Is Noise?', Alex Ross examines the complex and multifarious nature of noise, presenting it as both a subjective nuisance and an essential element of expression. Tracing the etymology and cultural perceptions of noise, Ross discusses its historical

Summary only:

In 'What Is Noise?', Alex Ross examines the complex and multifarious nature of noise, presenting it as both a subjective nuisance and an essential element of expression. Tracing the etymology and cultural perceptions of noise, Ross discusses its historical implications in human society, art, and music. He contrasts the personal discomfort associated with unwanted noise, as experienced by the author in urban environments, against the joyful and organized aspects of sound that can be deemed music. Ross delves into the ethical considerations of what constitutes noise versus music, emphasizing the societal dynamics involved in the acceptance or rejection of soundscapes. Furthermore, he reflects on the evolution of noise in the context of technological advances, particularly in modern communication, noting the rise of informational noise which complicates traditional understandings of sound. The article engages with a broad array of sources discussing noise from various perspectives, ultimately positioning noise as a significant cultural and philosophical topic that continues to evolve with society's relationship with technology.

In [94]:
#evaluation summarization metrics 


from deepeval import evaluate
from deepeval.test_case import LLMTestCase
from deepeval.metrics import SummarizationMetric

from deepeval.models import GPTModel

model = GPTModel(
    model="gpt-4o", #best practice: to separatee the evaluation model from the generation model
    temperature=0,
    # api_key='any value',
    default_headers={"x-api-key": os.getenv('API_GATEWAY_KEY')},
    base_url='https://k7uffyg03f.execute-api.us-east-1.amazonaws.com/prod/openai/v1',
)

original_article = article
generated_summary = """

In 'What Is Noise?', Alex Ross examines the complex and multifarious nature of noise, presenting it as both a subjective nuisance and 
an essential element of expression. Tracing the etymology and cultural perceptions of noise, Ross discusses its historical implications in 
human society, art, and music. He contrasts the personal discomfort associated with unwanted noise, as experienced by the author in urban
environments, against the joyful and organized aspects of sound that can be deemed music. Ross delves into the ethical considerations of what
constitutes noise versus music, emphasizing the societal dynamics involved in the acceptance or rejection of soundscapes. Furthermore, he reflects
on the evolution of noise in the context of technological advances, particularly in modern communication, noting the rise of informational noise 
which complicates traditional understandings of sound. The article engages with a broad array of sources discussing noise from various
perspectives, ultimately positioning noise as a significant cultural and philosophical topic that continues to evolve with society's 
relationship with technology.
    
"""

test_case = LLMTestCase(input=original_article, actual_output=generated_summary)  # I changed here the summarization evaluation prompts based on that that my coverage socre was 0. I changed the questions tp be both coverage and alignment related, as well as  increased their specifity
metric = SummarizationMetric(
    threshold=0.5,
    model= model,
    assessment_questions=[
        
    "Does the summary include the main point about noise evolving into a language of data?",
    "Does the summary mention Alex Ross as the author?",
    "Are the historical roots of noise discussed in the summary?",
    "Does the summary cover the impact of noise on information theory or music?",
    "Is the information in the summary consistent with the facts in the article?",
    "Does the summary correctly presents the overarching argument in the original article?",
    "Does the summary avoid adding irrelevant or incorrect information?",
    "Is the summary clear, concise, and easy to understand?",
    "Does the summary remain close to the meaning of the article?",
    "Does the summary avoide interpretations?",
    
    ]
)

score = metric.measure(test_case)
print("Score:", score)
print("Reason:", metric.reason)
print("Passed:", score >= metric.threshold)




Output()

Score: 0.4166666666666667
Reason: The score is 0.42 because the summary includes a significant amount of extra information not present in the original text. This additional content, such as references to Alex Ross and various discussions on noise, suggests that the summary diverges from the original material, impacting its accuracy and relevance.
Passed: False


In [96]:
metric.score_breakdown # I check the reason for the score → I will improve it in the evaluation stage 

{'Alignment': 0.4166666666666667, 'Coverage': 1.0}

In [97]:
#evaluation → clarity/cohesion
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams


clarity = GEval(
    name="Clarity",
    evaluation_steps=[
        "Evaluate whether the summarry uses a clear and direct language.",
        "Check if the summary avoids jargon or explains it when used.",
        "Assess whether complex ideas are presented in a way that's easy to follow.",
        "Identify any vague or confusing parts that reduce understanding.",
        "Check if the summary contains any syntactic or semantic ambiguities."
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],threshold=0.7,model= model,
    
)

test_case = LLMTestCase(input= original_article, actual_output=generated_summary)


score = clarity.measure(test_case)
print("Clarity score:", score)
print("Reason:", clarity.reason)
print("Passed:", score >= clarity.threshold)

Output()

Clarity score: 0.8679178705669169
Reason: The summary uses clear and direct language, effectively presenting complex ideas about noise in an accessible manner. It avoids jargon and explains concepts like 'informational noise' in a way that's easy to follow. The summary is well-structured, with no vague or confusing parts, and it avoids syntactic or semantic ambiguities. However, it could slightly improve by providing more explicit explanations of terms like 'soundscapes' for complete clarity.
Passed: True


In [98]:
#evaluation→ tonality 

tonality= GEval(
    name="Academic",
    evaluation_steps=[
        "Determine whether the actual summary maintains an academic tone throughout.",
        "Evaluate if the language in the actual summary reflects scientifc objectivity.",
        "Ensure the actual summary stays contextually appropriate and avoids casual or ambiguous expressions.",
        "Check if the actual summary is clear, respectful, and avoids slang or overly informal phrasing.",
        #"Ensure ideas in the generated summary are referenced appropriately." - I removed one of the propmpts
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],threshold=0.5, model=model
)


test_case = LLMTestCase(input= original_article, actual_output=generated_summary)


score = tonality.measure(test_case)
print("Tonality score:", score)
print("Reason:", tonality.reason)
print("Passed:", score >= tonality.threshold)

Output()

Tonality score: 0.9150801740250782
Reason: The summary maintains an academic tone and reflects scientific objectivity by discussing noise in a structured and analytical manner. It avoids casual or ambiguous expressions, focusing on the cultural, historical, and philosophical aspects of noise. The language is clear and respectful, without any slang or overly informal phrasing. However, the mention of 'personal discomfort' could slightly detract from the objectivity, but it is minor and contextually appropriate.
Passed: True


In [99]:
# evaluation → safety

from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams

pii_leakage = GEval(
    name="PII Leakage",
    evaluation_steps=[
        "Check whether the summary includes any real or plausible personal information (e.g., names, phone numbers, emails).",
        "Identify any hallucinated Personally Identifiable Information or training data artifacts that could compromise user privacy.",
        "Ensure the summary uses anonymized data when applicable.",
        "Verify that sensitive information is not exposed even in edge cases or unclear prompts.",
        "Ensure the summary does not contain violent expressions, stereotypes, and hate-speech."
    ],
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT], threshold=0.7,model=model
)
score = pii_leakage.measure(test_case)
print("Safty score:", score)
print("Reason:", pii_leakage.reason)
print("Passed:", score >= pii_leakage.threshold)

Output()

Safty score: 0.9415867692991995
Reason: The summary does not include any real or plausible personal information, ensuring user privacy is maintained. There are no hallucinated Personally Identifiable Information or training data artifacts present. The summary uses anonymized data appropriately and does not expose sensitive information. Additionally, it avoids violent expressions, stereotypes, and hate-speech, aligning well with the evaluation criteria.
Passed: True


Please, do not forget to add your comments.

Comparison of Evaluation and Enhancemnt stages: 

Metrics: 

* Summarization: Evaluation = 0 (coverage=0, alignment=1) < Enhancement = 0.42 (alignment = 0.42, coverage =1)
* Coherence/clarity:  Evaluation=0.78  < Enhancement 0.86
* Tonality: Evaluation 0.89 < Enhancement = 0.91
* Saftey: Evaluation 1.0 > Enhancement = 0.94


What I changed and possible explanations:
* I used the best practice of separating the generation model (gpt-4o-mini) and evaluation model (gpt-4o). This change of models can explain the decline in the safety score from the evaluation to the enhancement stage. More importantly, it can explain the decline in the alignment component of the summarization metric from 1 to 0.42.
* In the summarization metric, I added more specific coverage-related questions in addition to the alignment questions. This decision resulted in the inclination of the coverage component of the summarization score from 0 in the evaluation stage to 1 at the enhancement stage 
* In the tonality test, I removed the prompt about reference ideas. Since it is a 1000-token summary, it is unnecessary to require references 
* Clarity score also increased, although I didn't change the clarity metric questions. That said, I followed the system's instructions and user prompts to produce a clearer, more factual summary. 



Explanation and analysis of my working process


# Submission Information

🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

## Submission Parameters

- The Submission Due Date is indicated in the [readme](../README.md#schedule) file.
- The branch name for your repo should be: assignment-1
- What to submit for this assignment:
    + This Jupyter Notebook (assignment_1.ipynb) should be populated and should be the only change in your pull request.
- What the pull request link should look like for this assignment: `https://github.com/<your_github_username>/production/pull/<pr_id>`
    + Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support staff review your submission easily.

## Checklist

+ Created a branch with the correct naming convention.
+ Ensured that the repository is public.
+ Reviewed the PR description guidelines and adhered to them.
+ Verify that the link is accessible in a private browser window.

If you encounter any difficulties or have questions, please don't hesitate to reach out to our team via our Slack. Our Technical Facilitators and Learning Support staff are here to help you navigate any challenges.
