# 1.1. Basic Concept

In an informal sense, in Python we do things with stuff. “Things” take the form of operations like addition and concatenation, and “stuff” refers to the objects on which we perform those operations.

Somewhat more formally, in Python, data takes the form of objects—either built-in objects that Python provides, or objects we create using Python classes or external language tools such as C extension libraries. Although we’ll firm up this definition later, objects are essentially just pieces of memory, with values and sets of associated operations. As we’ll see, everything is an object in a Python script. Even simple numbers qualify, with values (e.g., `99`), and supported operations (addition, subtraction, and so on).
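As a quick, informal check of the claim that even simple numbers are objects, the sketch below inspects the number `99`. The variable names are illustrative only.

```python
n = 99

# Every value is an instance of the base class `object`.
is_object = isinstance(n, object)

# Numbers carry a type and support operations, like any other object.
type_name = type(n).__name__      # the name of the object's type
total = n + 1                     # addition is one of int's operations
bits = n.bit_length()             # integers even have methods

print(is_object, type_name, total, bits)
```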

## The Python Conceptual Hierarchy

From a more concrete perspective, Python programs can be decomposed into modules, statements, expressions, and objects, as follows:

1. Programs are composed of modules.
2. Modules contain statements.
3. Statements contain expressions.
4. ***Expressions create and process objects.***

## Object, Literal and Variable

*Everything* we process in Python programs is a kind of ***object***. Objects can be created by writing *literals*, which are usually assigned to *variables*, for example,

In [5]:
x = 17
x

17

or by calling the relevant data type as a function, for example,

In [None]:
x = int(17)
print(x)

17


The term ***literal*** simply means an expression whose syntax generates an object—sometimes also called a ***constant***. Note that the term “constant” does not imply objects or variables that can never be changed (i.e., this term is unrelated to C++’s `const` or Python’s “immutable”—a topic explored in the section “Immutability”).

For instance, when you run the following code with characters surrounded by quotes:


In [None]:
'spam'

'spam'

In [21]:
type('spam')

str

you are, technically speaking, running a *literal* expression that generates and returns a new *string* ***object***. There is specific Python language syntax to make this object. Similarly, an expression wrapped in square brackets makes a *list*, one in curly braces makes a *dictionary*, and so on. 
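To make the link between literal syntax and object type concrete, here is a small sketch that builds a few objects from literals and reports the type of each. The sample values are arbitrary, and the tuple and float entries go slightly beyond the types named above.

```python
# Each literal expression below builds a new object of a built-in type.
samples = ['spam', [1, 2, 3], {'a': 1}, (1, 2), 3.14]

for obj in samples:
    print(repr(obj), '->', type(obj).__name__)

type_names = [type(obj).__name__ for obj in samples]
```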


**Variables** are simply names—created by you or Python—that are used to keep track of information in your program. 
 - Variables are created when they are first assigned values.
 - Variables are replaced with their values when used in expressions.
 - Variables must be assigned before they can be used in expressions.
 - Variables refer to objects and are never declared ahead of time.

In other words, these assignments cause the variables `a` and `b` to spring into existence automatically:

In [7]:
a = 3                  # Name created: not declared ahead of time
b = 4
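The rule that variables must be assigned before they are used can be seen by referencing a name too early. In this sketch, `undefined_total` is a hypothetical name chosen only for illustration.

```python
try:
    print(undefined_total)        # hypothetical name, not assigned yet
except NameError as exc:
    message = str(exc)
    print('NameError:', message)

undefined_total = 10              # assignment creates the name
print(undefined_total + 1)        # now it can be used in expressions
```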

## Python’s Core (Built-in) Object Types
Python’s **built-in object types** are as follows: 

 - Numbers 
 - Strings 
 - Lists 
 - Dictionaries 
 - Tuples 
 - Sets 
 - Files 
 - Other core types 
 - Program unit types 
 - Implementation-related types 

We usually call the object types in the above list **core data types** because they are effectively built into the Python language; that is, there is specific expression syntax for generating most of them. 

***Program units*** such as *functions*, *modules*, and *classes*, which we’ll meet later, are objects in Python too; they are created with *statements* and *expressions* such as `def`, `class`, `import`, and `lambda` and may be passed around scripts freely, stored within other objects, and so on. 

Python also provides a set of *implementation-related types* such as *compiled code* objects, which are generally of interest to tool builders more than application developers.

Even though, as we’ll see, there are no *type declarations* in Python, the syntax of the *expressions* you run determines the types of objects you create and use. Just as importantly, once you create an object, you bind its operation set for all time—you can perform only string operations on a string and list operations on a list. In formal terms, this means that Python is ***dynamically typed***, a model that keeps track of types for you automatically instead of requiring declaration code, but it is also ***strongly typed***, a constraint that means you can perform on an object only operations that are valid for its type.
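A brief sketch of what strong typing means in practice: a string accepts string operations, but an attempt to mix it with a number is rejected instead of being silently converted. The variable names here are illustrative.

```python
text = 'spam'

# String operations work on strings...
shout = text.upper() + '!'

# ...but mixing incompatible types raises an error.
try:
    text + 1
except TypeError as exc:
    error = type(exc).__name__

# List-only operations are not available on strings either.
has_append = hasattr(text, 'append')

print(shout, error, has_append)
```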


### Why Use Built-in Types?
If you’ve used lower-level languages such as C or C++, you know that much of your work centers on implementing *objects*—also known as *data structures*—to represent the components in your application’s domain. You need to lay out memory structures, manage memory allocation, implement search and access routines, and so on. These chores are about as tedious (and error-prone) as they sound, and they usually distract from your program’s real goals.

In typical Python programs, most of this grunt work goes away. Because Python provides powerful object types as an intrinsic part of the language, there’s usually no need to code object implementations before you start solving problems. In fact, unless you have a need for special processing that built-in types don’t provide, you’re almost always better off using a built-in object instead of implementing your own. Here are some reasons why:

 - **Built-in objects make programs easy to write.** 
     For simple tasks, built-in types are often all you need to represent the structure of problem domains. Because you get powerful tools such as collections (lists) and search tables (dictionaries) for free, you can use them immediately. You can get a lot of work done with Python’s built-in object types alone.

 - **Built-in objects are components of extensions.** 
     For more complex tasks, you may need to provide your own objects using Python classes or C language interfaces. But objects implemented manually are often built on top of built-in types such as lists and dictionaries. For instance, a stack data structure may be implemented as a class that manages or customizes a built-in list.

 - **Built-in objects are often more efficient than custom data structures.** 
     Python’s built-in types employ already optimized data structure algorithms that are implemented in C for speed. Although you can write similar object types on your own, you’ll usually be hard-pressed to get the level of performance built-in object types provide.

 - **Built-in objects are a standard part of the language.** 
     In some ways, Python borrows both from languages that rely on built-in tools (e.g., LISP) and languages that rely on the programmer to provide tool implementations or frameworks of their own (e.g., C++). Although you can implement unique object types in Python, you don’t need to do so just to get started. Moreover, because Python’s built-ins are standard, they’re always the same; proprietary frameworks, on the other hand, tend to differ from site to site.
 
In other words, not only do built-in object types make programming easier, but they’re also more powerful and efficient than most of what can be created from scratch. Regardless of whether you implement new object types, built-in objects form the core of every Python program.
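The stack mentioned above can be sketched in a few lines. This is only one possible design, and the `Stack` class and its method names are assumptions made for illustration; the point is that the built-in list does nearly all of the work.

```python
class Stack:
    """A minimal stack that delegates storage to a built-in list."""

    def __init__(self):
        self._items = []              # the built-in list holds the data

    def push(self, item):
        self._items.append(item)      # add to the top of the stack

    def pop(self):
        return self._items.pop()      # remove and return the top item

    def __len__(self):
        return len(self._items)


stack = Stack()
stack.push('a')
stack.push('b')
top = stack.pop()
print(top, len(stack))
```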


## Immutability
In Python, both `str` and the basic numeric types such as `int` are ***immutable***: once set, their value cannot be changed. At first this appears to be a rather strange limitation, but Python’s syntax means that this is a non-issue in practice. 

Every string operation is defined to produce a new string as its result, because strings are immutable in Python—they cannot be changed in place after they are created. In other words, you can never overwrite the values of immutable objects. For example, you can’t change a string by assigning to one of its positions, but you can always build a new one and assign it to the same name. Because Python cleans up old objects as you go, this isn’t as inefficient as it may sound:

In [None]:
S = 'Spam'
S

'Spam'

In [None]:
S[0] = 'z'             # Immutable objects cannot be changed

TypeError: 'str' object does not support item assignment

In [None]:
S = 'z' + S[1:]        # But we can run expressions to make new objects
S

'zpam'

Every object in Python is classified as either immutable (unchangeable) or not. 
In terms of the core types, *numbers*, *strings*, and *tuples* are *immutable*; 
*lists*, *dictionaries*, and *sets* are not—they can be changed *in place* freely, 
as can most new objects you’ll code with classes. 



In [None]:
L = ['s', 'p', 'a', 'm', 3, 4]    # A list object
L

['s', 'p', 'a', 'm', 3, 4]

In [None]:
L[0] = 'z'                        # A list object can be changed in place
L

['z', 'p', 'a', 'm', 3, 4]

This distinction turns out to be crucial in Python work, in ways that we can’t yet fully explore. 
Among other things, immutability can be used to guarantee that an object remains constant throughout your program; 
mutable objects’ values can be changed at any time and place (and whether you expect it or not).
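The same contrast holds for the other core types named above. As a short sketch with arbitrary sample data, a tuple rejects item assignment just as a string does, while a dictionary can be changed and grown in place.

```python
T = ('s', 'p', 'a', 'm')      # a tuple: immutable
try:
    T[0] = 'z'
except TypeError as exc:
    tuple_error = str(exc)

D = {'food': 'spam'}          # a dictionary: mutable
D['food'] = 'eggs'            # changed in place
D['count'] = 2                # grown in place

print(tuple_error)
print(D)
```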


## Object References

When we assign a value to a variable, as in:

In [12]:
a = 3                # Assign a name to an object

at least conceptually, Python will perform three distinct steps to carry out the request. 
These steps reflect the operation of all assignments in the Python language:
1. Create an object to represent the value `3`.
2. Create the variable `a`, if it does not yet exist.
3. Link the variable `a` to the new object `3`.

The net result will be a structure inside Python that resembles the figure below:

![Variable reference](../../imgs/variable_reference.jpg)
:label:`fig_var_ref`

Variables and objects are stored in different parts of memory and 
are associated by links (the link is shown as a pointer in the figure). 
Variables always link to objects and never to other variables, 
but larger objects may link to other objects (for instance, 
a list object has links to the objects it contains). 
These links from variables to objects are called ***references*** in Python—that is, 
a reference is a kind of association, implemented as a pointer in memory.

Whenever the variables are later used (i.e., referenced), Python automatically follows the variable-to-object links. This is all simpler than the terminology may imply. In concrete terms:
 - Variables are entries in a system table, with spaces for links to objects.
 - Objects are pieces of allocated memory, with enough space to represent the values for which they stand.
 - References are automatically followed pointers from variables to objects.

At least conceptually, each time you generate a new value in your script by running an expression, Python creates a new object (i.e., a chunk of memory) to represent that value. As an optimization, Python internally caches and reuses certain kinds of unchangeable objects, such as small integers and strings (each 0 is not really a new piece of memory—more on this caching behavior later). But from a logical perspective, it works as though each expression’s result value is a distinct object and each object is a distinct piece of memory.

Technically speaking, objects have more structure than just enough space to represent their values. Each object also has two standard header fields: a ***type designator*** used to mark the type of the object, and a ***reference counter*** used to determine when it’s OK to reclaim the object. To understand how these two header fields factor into the model, we need to move on.
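Both header fields can be observed from Python code, at least on the standard CPython implementation. The sketch below uses `type` and `sys.getrefcount`; the exact counts reported are an implementation detail, so it compares only the difference made by adding one more reference.

```python
import sys

obj = ['spam']                      # a fresh list object

kind = type(obj)                    # the type designator: the list type
before = sys.getrefcount(obj)       # the count may include temporary references

alias = obj                         # add one more reference to the same object
after = sys.getrefcount(obj)

print(kind.__name__, after - before)
```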


## Types Live with Objects, Not Variables
To see how object types come into play, watch what happens if we assign a variable multiple times:


In [None]:
a = 3             # It's an integer
a = 'spam'        # Now it's a string
a = 1.23          # Now it's a floating point

This isn’t typical Python code, but it does work—`a` starts out as an integer, 
then becomes a string, and finally becomes a floating-point number. 
This example tends to look especially odd to ex-C programmers, 
as it appears as though the type of `a` changes from integer to string when we say `a = 'spam'`.

However, that’s not really what’s happening. 
In Python, things work more simply. Names have no types; as stated earlier, 
types live with objects, not names. In the preceding listing, 
we’ve simply changed `a` to reference different objects. 
Because variables have no type, we haven’t actually changed the type of the variable `a`; 
we’ve simply made the variable reference a different type of object. 
In fact, again, all we can ever say about a variable in Python is that 
it references a particular object at a particular point in time.

Objects, on the other hand, know what type they are—
each object contains a header field that tags the object with its type. 
The integer object `3`, for example, will contain the value `3`, 
plus a designator that tells Python that the object is an integer 
(strictly speaking, a pointer to an object called `int`, the name of the integer type). 
The type designator of the `'spam'` string object points to the string type (called `str`) instead. 
Because objects know their types, variables don’t have to.
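One way to see this is to ask each object for its type as the name `a` is rebound. In this sketch, `type` reports on whatever object `a` currently references, not on `a` itself.

```python
a = 3
first = type(a).__name__        # type of the object a references now

a = 'spam'
second = type(a).__name__       # a different object, so a different type

a = 1.23
third = type(a).__name__

print(first, second, third)
```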

To recap, types are associated with objects in Python, not with variables. 
In typical code, a given variable usually will reference just one kind of object. 
Because this isn’t a requirement, though, 
you’ll find that Python code tends to be much more flexible 
than you may be accustomed to—if you use Python well, 
your code might work on many types automatically.

It was mentioned that objects have two header fields, 
a *type designator* and a *reference counter*. 
To understand the latter of these, 
we need to move on and take a brief look at what happens at the end of an object’s life.



## Garbage Collection
In the prior section’s listings, 
we assigned the variable `a` to different types of objects in each assignment. 
But when we reassign a variable, what happens to the object it was previously referencing? 
For example, after the following statements, what happens to the object `3`?


In [None]:
a = 3
a = 'spam'

The answer is that in Python, whenever a name is assigned to a new object, 
the space held by the prior object is reclaimed if it is not referenced by any other name or object. 
This automatic reclamation of objects’ space is known as ***garbage collection***, 
and makes life much simpler for programmers of languages like Python that support it.

To illustrate, consider the following example, which sets the name `x` to a different object on each assignment:


In [11]:
x = 42
x = 'shrubbery'          # Reclaim 42 now (unless referenced elsewhere)
x = 3.1415               # Reclaim 'shrubbery' now
x = [1, 2, 3]            # Reclaim 3.1415 now

First, notice that `x` is set to a different type of object each time. 
Again, though this is not really the case, 
the effect is as though the type of `x` is changing over time. 
Remember, in Python types live with objects, not names. 
Because names are just generic references to objects, 
this sort of code works naturally.

Second, notice that references to objects are discarded along the way. 
Each time `x` is assigned to a new object, Python reclaims the prior object’s space. 
For instance, when it is assigned the string `'shrubbery'`, 
the object `42` is immediately reclaimed (assuming it is not referenced anywhere else)—
that is, the object’s space is automatically thrown back into the free space pool, 
to be reused for a future object.

Internally, Python accomplishes this feat by keeping a counter in every object 
that keeps track of the number of references currently pointing to that object. 
As soon as (and exactly when) this counter drops to zero, 
the object’s memory space is automatically reclaimed. 

In the preceding listing, we’re assuming that each time `x` is assigned to a new object, 
the prior object’s reference counter drops to zero, causing it to be reclaimed.
The most immediately tangible benefit of *garbage collection* is that 
it means you can use objects liberally without ever needing to allocate or free up space in your script. 
Python will clean up unused space for you as your program runs. 
In practice, this eliminates a substantial amount of bookkeeping code 
required in lower-level languages such as C and C++.
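Reclamation can be made visible with a small experiment. In the sketch below, `Tracked` is a hypothetical helper class whose `__del__` method records when an instance is reclaimed. The immediate timing shown is how CPython's reference counting behaves; other Python implementations may reclaim objects later.

```python
reclaimed = []

class Tracked:
    """Hypothetical helper that records when an instance is reclaimed."""

    def __init__(self, label):
        self.label = label

    def __del__(self):
        reclaimed.append(self.label)    # runs when the object is reclaimed


x = Tracked('first')
x = Tracked('second')     # the 'first' object loses its only reference here
print(reclaimed)

x = None                  # now the 'second' object is unreferenced too
print(reclaimed)
```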



## Shared References
So far, we’ve seen what happens as a single variable is assigned references to objects. 
Now let’s introduce another variable into our interaction and watch what happens to its names and objects:


In [15]:
a = 3
b = a

a, b

(3, 3)

Typing these two statements generates the scene captured in the figure below. 
The second command causes Python to create the variable `b`; 
the variable `a` is being used and not assigned here, 
so it is replaced with the object it references (`3`), 
and `b` is made to reference that object. 
The net effect is that the variables `a` and `b` wind up referencing the same object 
(that is, pointing to the same chunk of memory).

![Shared References](../../imgs/shared_references_1.jpg)

This scenario in Python—with multiple names referencing the same object—
is usually called a ***shared reference*** (and sometimes just a ***shared object***). 
Note that the names `a` and `b` are not linked to each other directly when this happens; 
in fact, there is no way to ever link a variable to another variable in Python. 
Rather, both variables point to the same object via their references.
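A shared reference can be confirmed directly. As a minimal sketch, the `is` operator tests whether two names reference the same object, and the built-in `id` function reports an object's identity.

```python
a = 3
b = a

same_object = a is b            # both names reference one object
same_id = id(a) == id(b)        # equal identities mean the same object

print(same_object, same_id)
```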

Next, suppose we extend the session with one more statement:


In [16]:
a = 3
b = a
a = 'spam'

a, b

('spam', 3)

As with all Python assignments, this statement simply makes a new object to represent the string value `'spam'` 
and sets `a` to reference this new object. It does not, however, change the value of `b`; 
`b` still references the original object, the integer `3`. 
The resulting reference structure is shown in the figure below.

![Shared References_2](../../imgs/shared_references_2.jpg)

The same sort of thing would happen if we changed `b` to `'spam'` instead—the assignment would change only `b`, not `a`. 
This behavior also occurs if there are no type differences at all. For example, consider these three statements:



In [17]:
a = 3
b = a
a = a + 2

a, b

(5, 3)

In this sequence, the same events transpire. 
Python makes the variable `a` reference the object `3` and 
makes `b` reference the same object as `a`; 
as before, the last assignment then sets `a` to a completely different object 
(in this case, the integer `5`, which is the result of the `+` expression). 
It does not change `b` as a side effect. In fact, 
there is no way to ever overwrite the value of the object `3`. 
Integers are *immutable* and thus can never be changed in place.

One way to think of this is that, unlike in some languages, 
in Python variables are always pointers to objects, not labels of changeable memory areas: 
setting a variable to a new value does not alter the original object, 
but rather causes the variable to reference an entirely different object. 
The net effect is that assignment to a variable itself can impact 
only the single variable being assigned. 
When *mutable* objects and *in-place* changes enter the equation, 
though, the picture changes somewhat; to see how, let’s move on.


## Shared References and In-Place Changes
There are objects and operations 
that perform *in-place* object changes—Python’s *mutable* types, 
including lists, dictionaries, and sets. 
For instance, an assignment to an offset in a list 
actually changes the list object itself in place, 
rather than generating a brand-new list object.

This distinction can matter a great deal in your programs. 
For objects that support such in-place changes, 
you need to be more aware of shared references, 
since a change from one name may impact others. 
Otherwise, your objects may seem to change for no apparent reason. 
Given that all assignments are based on references 
(including function argument passing), 
the potential for such surprises is pervasive.

To illustrate, let’s take another look at the list objects. 
Lists, which do support in-place assignments to positions, 
are simply collections of other objects, coded in square brackets:


In [25]:
L1 = [2, 3, 4]
L2 = L1

L1, L2

([2, 3, 4], [2, 3, 4])

`L1` here is a list containing the objects `2`, `3`, and `4`. 
Items inside a list are accessed by their positions, 
so `L1[0]` refers to object `2`, the first item in the list `L1`. 
Of course, lists are also objects in their own right, just like integers and strings. 
After running the two prior assignments, 
`L1` and `L2` reference the same shared object, 
just like `a` and `b` in the prior example. 
Now say that, as before, we extend this interaction to say the following:


In [26]:
L1 = 24
L1, L2

(24, [2, 3, 4])

This assignment simply sets `L1` to a different object; `L2` still references the original list.

If we change this statement’s syntax slightly, however, it has a radically different effect:


In [27]:
L1 = [2, 3, 4]        # A mutable object
L2 = L1               # Make a reference to the same object
L1[0] = 24            # An in-place change

L1, L2                # L1 is different, but so is L2!


([24, 3, 4], [24, 3, 4])

Really, we haven’t changed `L1` itself here; 
we’ve changed a component of the object that `L1` references. 
This sort of change overwrites part of the list object’s value ***in place***. 
Because the list object is shared by (referenced from) other variables, 
though, an in-place change like this doesn’t affect only `L1`—that is, 
you must be aware that when you make such changes, 
they can impact other parts of your program. 

In this example, the effect shows up in `L2` as well 
because it references the same object as `L1`. 
Again, we haven’t actually changed `L2`, either, 
but its value will appear different 
because it refers to an object that has been overwritten in place.

This behavior only occurs for *mutable* objects that support *in-place* changes, 
and is usually what you want, but you should be aware of how it works, so that it’s expected. 

It’s also just the default: if you don’t want such behavior, 
you can copy the list with its `copy` method or with the built-in `list` function:

In [30]:
L1 = [2, 3, 4]
L2 = L1.copy()            # Make a copy of L1 (or list(L1))
L1[0] = 24

L1, L2                    # L2 is not changed

([24, 3, 4], [2, 3, 4])


Here, the change made through `L1` is not reflected into `L2` 
because `L2` references a copy of the object `L1` references, 
not the original; that is, the two variables point to different pieces of memory.

In [31]:
L1 = [2, 3, 4]
L2 = list(L1)
L1[0] = 24

L1, L2                    # L2 is not changed

([24, 3, 4], [2, 3, 4])
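Because function argument passing is also based on references, the same effect appears when a list is passed to a function. In this sketch, `zero_first` is a hypothetical helper that changes its argument in place; passing a copy protects the caller's list.

```python
def zero_first(items):
    """Hypothetical helper: overwrite the first item in place."""
    items[0] = 0


shared = [2, 3, 4]
zero_first(shared)                 # the function receives a reference
print(shared)

protected = [2, 3, 4]
zero_first(protected.copy())       # the function changes only the copy
print(protected)
```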

## Dynamic Typing 
In sum, variables are created when assigned, can *reference* any type of object, and must be assigned before they are referenced. This means that you never need to declare names used by your script, but you must initialize names before you can update them; counters, for example, must be initialized to zero before you can add to them.

This ***dynamic typing*** model is strikingly different from the typing model of traditional languages. When you are first starting out, the model is usually easier to understand if you keep clear the distinction between names and objects. 

Of course, you don’t really need to draw name/object diagrams with circles and arrows to use Python. When you’re starting out, though, it sometimes helps you understand unusual cases if you can trace their reference structures as we’ve done here.

Moreover, even if dynamic typing seems a little abstract at this point, you probably will care about it eventually. Because everything seems to work by assignment and references in Python, a basic understanding of this model is useful in many different contexts. It works the same in assignment statements, function arguments, for loop variables, module imports, class attributes, and more. The good news is that there is just one assignment model in Python; once you get a handle on dynamic typing, you’ll find that it works the same everywhere in the language.

At the most practical level, dynamic typing means there is less code for you to write. Just as importantly, though, dynamic typing is also the root of Python’s *polymorphism*.
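As a small illustration of that idea, the function below carries no type declarations, so it works on any object that supports the `*` operator. The name `double` is hypothetical and chosen only for this sketch.

```python
def double(value):
    """Hypothetical helper: works for any object that supports `*`."""
    return value * 2


results = [double(21), double('spam'), double([1, 2])]
print(results)
```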

### “Weak” References
You may occasionally see the term “weak reference” in the Python world. This is a somewhat advanced tool, but is related to the reference model we’ve explored here, and like the `is` operator, can’t really be understood without it.

In short, a weak reference, implemented by the `weakref` standard library module, is a reference to an object that does not by itself prevent the referenced object from being garbage-collected. If the last remaining references to an object are weak references, the object is reclaimed and the weak references to it are automatically deleted (or otherwise notified).
This can be useful in dictionary-based caches of large objects, for example; otherwise, the cache’s reference alone would keep the object in memory indefinitely. Still, this is really just a special-case extension to the reference model. For more details, see Python’s library manual.
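The cache scenario can be sketched with `weakref.WeakValueDictionary` from the standard library. Here `Image` is a hypothetical stand-in for a large object. The entry disappears as soon as the last ordinary reference is dropped on CPython; other implementations may remove it later.

```python
import weakref


class Image:
    """Hypothetical stand-in for a large object worth caching."""

    def __init__(self, name):
        self.name = name


cache = weakref.WeakValueDictionary()

picture = Image('logo')
cache['logo'] = picture            # the cache holds only a weak reference
in_cache_before = 'logo' in cache

del picture                        # drop the last ordinary reference
in_cache_after = 'logo' in cache   # the entry no longer keeps the object alive

print(in_cache_before, in_cache_after)
```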


