# Reading & Writing Machines

The early history of computing revolves around efforts to automate human computation -- human labor. And from Lovelace to Turing and beyond, a key concern lay in the specification and refinement of _algorithms_: methods of reducing complex calculations and other operations to explicit formal rules, rules that could, in principle, be implemented with rigor and precision by purely mechanical (and later, of course, electronic) means. 

But as a means of understanding ChatGPT and other forms of [generative AI](https://en.wikipedia.org/wiki/Generative_artificial_intelligence), a consideration of algorithms only gets us so far. In fact, when it comes to the [large language models](https://en.wikipedia.org/wiki/Large_language_model) that have captivated the public imagination, I would argue that their "unreasonable effectiveness" is less a triumph of the algorithm than the manifestation of another strand of computation, bound up with the former, but motivated by distinct pressures and concerns. Instead of formal logic and mathematical proof, this strand draws on traditions of thinking about data, randomness, and probability. And instead of the prescription of actions, it aims at the description and prediction of aspects of the world. 

One of the most interesting moments in this tradition, in light of later developments, remains Claude Shannon's work on modeling the structure of printed English. In this interactive document, we will use the [Python programming language](https://www.python.org) to reproduce a couple of Shannon's experiments, in the hopes of pulling back the curtain a bit on what seems to many (and not unreasonably) to be evidence of a ghost in the machine. But the aim is not necessarily to demystify experiences of generative AI. I, for one, do find many of these experiences haunting. But maybe the haunting doesn't happen where we at first assume.

The material that follows draws on and is inspired by my reading of Lydia Liu's _The Freudian Robot_.



## Two kinds of coding

### Programs as code(s)

We imagine computers as machines that operate on 1's and 0's. In fact, the 1's and 0's are themselves an abstraction for human convenience: digital computation happens as a series of electronic pulses -- basically, switches that are either "on" or "off." (Think of counting to 10 by flipping a light switch on and off 10 times.)

Every digital representation -- everything that can be computed by a digital computer -- must be encoded, ultimately, in this binary form, translated into a sequence of pulses. 

But to make computers efficient for human use, a variety of additional layers of abstraction have been developed on top of the basic binary layer. By virtue of using computers and smartphones, we are all familiar with the concept of an interface, which instantiates a set of rules prescribing how we are to interact with the device in order to accomplish well-defined tasks. These interactions get encoded down to the level of electronic pulses (and the results of the computation are translated back into the encoding of the interface). 

A programming language is also an interface: just a text-based one. It represents a code into which we can translate our instructions for computation, in order for those instructions to be encoded further for processing. 

Let's start with a single instruction. Run the following line of Python code. You won't see any output -- that's okay.


In [1]:
answer_to_everything = 42

In the encoding specified by the Python language, the equals sign (`=`) is an instruction that loosely translates to: "Store this value (right side) somewhere in memory, and give that location in memory the provided label (left side)." The following image presents one way of imagining what happens in response to this code (with the caveat that, ultimately, the letters and numbers are represented by their binary encoding).  

[image here]

By running the previous line of code, we have created a _variable_ called `answer_to_everything`. We can use the variable to retrieve its value (for use in other parts of our program).

In [2]:
print(answer_to_everything)

42


The `print()` _function_ is a Python command that displays a value on the screen. The syntax of Python -- e.g., the name "print," as well as the parentheses that follow the function name and that enclose the _argument_, which is the thing we want to print -- is perfectly arbitrary (in the Saussurean sense). This syntax was invented by the designers of the Python language, though they drew on conventions found in other programming languages. The point is that nothing about the Python command `print(answer_to_everything)` makes its operation transparent; to know what it does, you have to know the language (or, at least, be familiar with the conventions of programming languages more generally) -- just as when learning to speak a foreign language, you can't deduce much about the meaning of the words from the way they look or sound.

However, unlike so-called "natural" languages, programming languages are, generally speaking, fully determinate. In other words, even minor deviations in syntax will usually cause errors, and errors will usually bring the whole program to a crashing halt.

In [3]:
print(answer_to_everythin)

NameError: name 'answer_to_everythin' is not defined

A misspelled variable causes Python to abort its computation. Imagine if conversation ground to a halt whenever one of the parties mispronounced a word or used a malapropism!

I like to say that Python is extremely literal. (But of course, this is merely an analogy, and a loose one. There is no room for metaphor in programming languages, at least, not as far as the computation is concerned.)

### Encoding text

As an engineer at Bell Labs, Claude Shannon wanted to find -- mathematically -- the most efficient means of encoding data for electronic transmission. Note that this task involves a rather different set of factors from those that influence the design of a programming language.

The designer of the language has the luxury of insisting that programmers adhere to the specified syntax exactly. When working in Python, we have to write `print(42)` to display the number `42` on the screen; if we forget the parentheses, for instance, the command won't work. But when we talk on the phone (or via Zoom, etc.), it would certainly be a hassle if we had to first translate our words into a strict, fault-intolerant code like that of Python. 

All the same, there is no digital (electronic) representation without encoding. To refer to the difference between these two types of codes, I am drawing a distinction between _algorithms_ and _data_. Shannon's work was among the first to illuminate this distinction, which remains highly relevant to the development of machine learning and generative AI.

Before we turn to Shannon's experiments with English text, let's look briefly at how Python represents text as data.

In [4]:
a_text = "Most noble and illustrious drinkers, and you thrice precious pockified blades (for to you, and none else, do I dedicate my writings), Alcibiades, in that dialogue of Plato's, which is entitled The Banquet, whilst he was setting forth the praises of his schoolmaster Socrates (without all question the prince of philosophers), amongst other discourses to that purpose, said that he resembled the Silenes."

Running the line above creates a new variable, `a_text`, and assigns to it a _string_ representing the first sentence from Francois Rabelais' early modern novel, _Gargantua and Pantagruel_. A string is the most basic way in Python of representing text, where "text" means anything that is not to be treated purely as a numeric value. 

Anything between quotation marks, in Python, is a string.

One problem with strings in Python (and other programming languages) is that they have very little structure.

A Python string is a determinate sequence of characters, where a _character_ is a letter of a recognized alphabet, a punctuation mark, a space, etc. Each character is stored in the computer's memory as a numeric code, and from that perspective, all characters are essentially equal.
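To make the idea of a numeric code a little more concrete, here is a minimal sketch using Python's built-in `ord()`, `chr()`, and `format()` functions. The variable names (`sample`, `codes`, `bits`) are my own choices for illustration, and the short sample stands in for any string.

```python
# A short sample; any string would do.
sample = "Most noble"

# ord() returns the numeric code (the Unicode code point) of a single character.
codes = [ord(character) for character in sample]
print(codes)

# format(code, "08b") writes that code out as eight binary digits.
bits = [format(code, "08b") for code in codes]
print(bits[0])  # the binary form of the first character, "M"

# chr() goes the other way: from a numeric code back to a character.
first_character = chr(codes[0])
print(first_character)

# A space has a numeric code just like a letter does.
space_code = ord(" ")
print(space_code)
```

Nothing in these numbers marks a letter as different in kind from a space or a comma; whatever structure we find in a text has to be imposed on top of this flat sequence.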

We can access a single character in a string by supplying its position. (Python counts characters in strings from left to right, starting with 0, not 1, for the first character.)

In [5]:
a_text[5]

'n'

We can access a sequence of characters -- here, the 40 characters in positions 10 through 49 (the range stops just before position 50).

In [6]:
a_text[10:50]

' and illustrious drinkers, and you thric'

We can even divide the string into pieces, using the occurrences of particular characters. The code below divides our text on the white space, returning a _list_ (another Python construct) of smaller strings.

In [7]:
a_text.split()

['Most',
 'noble',
 'and',
 'illustrious',
 'drinkers,',
 'and',
 'you',
 'thrice',
 'precious',
 'pockified',
 'blades',
 '(for',
 'to',
 'you,',
 'and',
 'none',
 'else,',
 'do',
 'I',
 'dedicate',
 'my',
 'writings),',
 'Alcibiades,',
 'in',
 'that',
 'dialogue',
 'of',
 "Plato's,",
 'which',
 'is',
 'entitled',
 'The',
 'Banquet,',
 'whilst',
 'he',
 'was',
 'setting',
 'forth',
 'the',
 'praises',
 'of',
 'his',
 'schoolmaster',
 'Socrates',
 '(without',
 'all',
 'question',
 'the',
 'prince',
 'of',
 'philosophers),',
 'amongst',
 'other',
 'discourses',
 'to',
 'that',
 'purpose,',
 'said',
 'that',
 'he',
 'resembled',
 'the',
 'Silenes.']

The strings in the list above correspond, loosely, to the individual words in the sentence from Rabelais' text. But Python really has no concept of "word," neither in English, nor any other natural language. (Technically, the only words Python recognizes are certain reserved Python keywords, as defined by the core specification of the language.)
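The sketch below illustrates both halves of that point. It assumes, for the sake of the example, that a "word" is whatever remains after trimming punctuation from the ends of each piece and lowercasing it; that rule is my own simplification, not something Python provides. It then uses the standard `keyword` module to check which of those pieces happen to coincide with words that Python itself reserves.

```python
import keyword
import string

# The opening of the Rabelais sentence, shortened for the example.
sample = ("Most noble and illustrious drinkers, and you thrice precious "
          "pockified blades (for to you,")

# split() cuts only on white space, so punctuation stays attached to the pieces.
pieces = sample.split()
print(pieces)

# Assumption: treat a "word" as a piece with punctuation trimmed from both
# ends and converted to lowercase. Python has no such notion built in.
words = [piece.strip(string.punctuation).lower() for piece in pieces]
print(words)

# The only words Python recognizes as such are its reserved keywords.
reserved = [word for word in words if keyword.iskeyword(word)]
print(reserved)
```

That "and" and "for" turn up in both vocabularies is a coincidence of design: to Python they are instructions, not English.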

## Language as a drunken walk

It's probably fair to say that when Shannon was developing his mathematical approach to encoding information, the algorithmic ideal still dominated computational research in Western Europe and the United States. In previous decades, philosophers like Bertrand Russell and mathematicians like David Hilbert had sought to develop a formal approach to mathematical proof that would unify the scientific disciplines, the goal of such research being to identify a core set of axioms, or logical rules, in terms of which all other "rigorous" procedures of thought could be expressed. Working within this tradition, Alan Turing had developed his model of what would become the digital computer. 

Can language as humans use it be reduced to such formal rules? On the face of it, it's easy to think not. However, that conclusion poses a problem for computation involving human (or "natural") language, since the computer is basically a formal-rule-following machine.

Shannon's innovation was to pose 



ChatGPT and its ilk have made the [Turing test](https://en.wikipedia.org/wiki/Turing_test) -- long a trope of science fiction and a topic of serious interest chiefly to computer scientists -- into something of a ubiquitous pastime. Certainly, those of us who regularly use the Internet as a source of information or participate in its discourse communities now face the disconcerting question: has what we're reading, seeing, listening to, etc., been produced by a human being or a computer program? How can we tell? Alan Turing proposed his test as a phenomenological benchmark: any machine that could successfully and reliably fool its human interlocutors into granting it the presumption of human intelligence could, in fact, be considered intelligent (in all relevant respects).

There's a lot to unpack in Turing's philosophical exercise. But as a tool for understanding how [generative AI](https://en.wikipedia.org/wiki/Generative_artificial_intelligence) works, or at least, for approaching the ground from which it springs, Turing's work is arguably less useful than that of his less celebrated contemporary, Claude Shannon.

Working at Bell Labs in the 1940s, Claude Shannon developed the [mathematical theory of communication](https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf). Although it is often referred to as a "theory of _information_," it is noteworthy that Shannon framed his work as a theory of _communication_. Regardless, the practical significance of Shannon's work is [immense](https://www.quantamagazine.org/how-claude-shannons-information-theory-invented-the-future-20201222/): the mathematical modeling introduced there underpins great swaths of modern telecommunications infrastructure, and it paved the way to our data-saturated digital mediascape. 

Shannon's model is motivated by a practical question: how can we determine the _most efficient means of encoding_ a given message? And although his model has proven relevant to any medium that can be represented digitally, his work was grounded, as Lydia Liu shows, in questions about language, specifically, _printed English_.
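One way to make "efficiency" concrete is the quantity Shannon called entropy: the sum, over all symbols, of -p log2(p), where p is a symbol's probability. Measured in bits per symbol, it sets a floor on the average length of any lossless encoding of symbols drawn independently with those probabilities. The sketch below estimates it from the character frequencies of a short sample. It rests on a strong simplifying assumption (that each character is chosen independently of its neighbors), and the names `sample`, `entropy`, and `fixed_width` are mine.

```python
import math
from collections import Counter

# A short lowercase sample; a longer text would give steadier estimates.
sample = "most noble and illustrious drinkers"

# Count how often each distinct character occurs.
counts = Counter(sample)
total = sum(counts.values())

# Entropy: the sum over characters of -p * log2(p), in bits per character.
entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())

# For comparison: the bits per character needed if every distinct character
# were treated as equally likely (before rounding up to a whole number of bits).
fixed_width = math.log2(len(counts))

print(round(entropy, 3), round(fixed_width, 3))
```

Because some characters occur more often than others, the entropy estimate falls below the equal-likelihood figure; that gap is the room an efficient code can exploit.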

### From Claude Shannon to ChatGPT (and back again)

If Turing's guiding concern was to know what kinds of intellectual activity could be automated, Shannon's was quite different: to know whether human language, in its panoply of uses, could be modeled as a [stochastic (random) process](https://en.wikipedia.org/wiki/Stochastic_process). Turing's point of departure lay in the resources of formal logic and mathematical proof; Shannon drew on data and probability. 
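As a loose illustration of what it means to treat text as a stochastic process (and not a reconstruction of Shannon's own procedure), the sketch below records which characters follow which in a short sample, then takes a random walk through those records. The names `followers` and `generate`, the fixed `seed`, and the choice to look back only one character are all assumptions made for the example.

```python
import random
from collections import defaultdict

# A lowercased excerpt from the Rabelais sentence, used as training text.
sample = ("most noble and illustrious drinkers, and you thrice precious "
          "pockified blades, for to you, and none else, do i dedicate my writings")

# For each character, record every character that follows it in the sample.
# Repeats are kept, so frequent successors are more likely to be drawn.
followers = defaultdict(list)
for current, following in zip(sample, sample[1:]):
    followers[current].append(following)


def generate(length, seed=0):
    """Take a random walk: start from the sample's first character and
    repeatedly pick, at random, one of the recorded followers."""
    rng = random.Random(seed)  # fixed seed so the walk is repeatable
    current = sample[0]
    output = [current]
    for _ in range(length - 1):
        options = followers.get(current)
        if options:
            current = rng.choice(options)
        else:
            # Dead end (a character with no recorded follower): start over.
            current = sample[0]
        output.append(current)
    return "".join(output)


generated = generate(60)
print(generated)
```

The output is gibberish, but every pair of adjacent characters in it is a pair that occurs in the sample. No rule of English has been written down anywhere; the only "knowledge" here is a tally of what followed what.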

Although forms of popular (and even scholarly) imagining about AI continue to draw on a Turing-esque framework, wherein the primary concern is with the meaning of intelligence, the "unreasonable effectiveness" of [large language models](https://en.wikipedia.org/wiki/Large_language_model) hearkens back to Shannon's experiments on the probabilistic modeling of English prose. And while we certainly couldn't build Siri or ChatGPT using just Shannon's insights, his methods might be regarded as an early exercise in machine learning. Could we also say that the digital humanities treads this same ground?

In this interactive document, we'll use the [Python programming language](https://www.python.org) to reproduce a couple of Shannon's experiments, in the hopes of pulling back the curtain a bit on what seems to many (and not unreasonably) to be evidence of a ghost in the machine. But the aim is not necessarily to demystify experiences of generative AI. I, for one, do find many of these experiences haunting, but I'm not sure the haunting happens where AI's prominent boosters claim that it does.
