<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="figures/PDSH-cover-small.png">

*This notebook contains an excerpt from the [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/PythonDataScienceHandbook).*

*The text is released under the [CC-BY-NC-ND license](https://creativecommons.org/licenses/by-nc-nd/3.0/us/legalcode), and code is released under the [MIT license](https://opensource.org/licenses/MIT). If you find this content useful, please consider supporting the work by [buying the book](http://shop.oreilly.com/product/0636920034919.do)!*



# Understanding Data Types in Python

Effective data-driven science and computation requires understanding how data is stored and manipulated.
<span class="mark">This section outlines and contrasts how arrays of data are handled in the Python language itself, and how NumPy improves on this</span>.
Understanding this difference is fundamental to understanding much of the material throughout the rest of the book.

Users of Python are often drawn in by its ease of use, one piece of which is dynamic typing.
While a statically-typed language like C or Java requires each variable to be explicitly declared, a dynamically-typed language like Python skips this specification. For example, in C you might specify a particular operation as follows:

```C
/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}
```

While in Python the equivalent operation could be written this way:

```python
# Python code
result = 0
for i in range(100):
    result += i
```

Notice <span class="girk">the main difference: in C, the data types of each variable are explicitly declared, while in Python the types are dynamically inferred.</span> This means, for example, that we can assign any kind of data to any variable:

```python
# Python code
x = 4
x = "four"
```

Here we've switched the contents of ``x`` from an integer to a string. The same thing in C would lead (depending on compiler settings) to a compilation error or other unintended consequences:

```C
/* C code */
int x = 4;
x = "four";  // FAILS
```

This sort of flexibility is one piece that makes Python and other dynamically-typed languages convenient and easy to use.
Understanding *how* this works is an important piece of learning to analyze data efficiently and effectively with Python.
But what <span class="girk">this type-flexibility also points to is the fact that Python variables are more than just their value; they also contain extra information about the type of the value.</span> We'll explore this more in the sections that follow.
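
One simple way to see that a name carries type information along with its value is to ask for it with the built-in `type()` after each assignment. A minimal sketch (the variable names `first_type` and `second_type` are only for illustration):

```python
# Rebinding a name to values of different types
x = 4
first_type = type(x)      # the object bound to x records that it is an int
x = "four"
second_type = type(x)     # x now refers to a str object instead

print(first_type.__name__, second_type.__name__)
```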

## A Python Integer Is More Than Just an Integer

The standard Python implementation is written in C.
This means that <span class="burk">every Python object is simply a cleverly-disguised C structure</span>, which contains not only its value, but other information as well. For example, <span class="mark">when we define an integer in Python, such as ``x = 10000``, ``x`` is not just a "raw" integer. It's actually a pointer to a compound C structure, which contains several values</span>.
Looking through the Python 3.4 source code, we find that the integer (long) type definition effectively looks like this (once the C macros are expanded):

```C
struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
```

A single integer in Python 3.4 actually contains four pieces:

- <span class="mark">``ob_refcnt``, a reference count that helps Python silently handle memory allocation and deallocation</span>
- <span class="mark">``ob_type``, which encodes the type of the variable</span>
- <span class="mark">``ob_size``, which specifies the size of the following data members</span>
- <span class="mark">``ob_digit``, which contains the actual integer</span> value that we expect the Python variable to represent.

This means that <span class="girk">there is some overhead in storing an integer in Python as compared to an integer in a compiled language like C</span>, as illustrated in the following figure:

![Integer Memory Layout](figures/cint_vs_pyint.png)

Here ``PyObject_HEAD`` is the part of the structure containing the reference count, type code, and other pieces mentioned before.

Notice the difference here: a C integer is essentially a label for a position in memory whose bytes encode an integer value.
A Python integer is a pointer to a position in memory containing all the Python object information, including the bytes that contain the integer value.
This extra information in the Python integer structure is what allows Python to be coded so freely and dynamically.
<span class="mark">All this additional information in Python types comes at a cost, however, which becomes especially apparent in structures that combine many of these objects.</span>
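
The overhead can be observed, at least roughly, with `sys.getsizeof`, which reports the size of an object in bytes. The sketch below assumes CPython; the exact numbers depend on the Python version and platform, so no specific output is shown. The 8-byte figure is an assumed size for a 64-bit C integer, used only for comparison.

```python
import sys

x = 10000
size_in_bytes = sys.getsizeof(x)   # size of the whole int object, header included
raw_c_long = 8                     # assumption: a 64-bit C long, for comparison only

print(f"Python int object: {size_in_bytes} bytes")
print(f"Raw 64-bit integer: {raw_c_long} bytes")
print(f"Type stored with the object: {type(x).__name__}")
```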

## A Python List Is More Than Just a List

Let's consider now what happens when we use a Python data structure that holds many Python objects.
The standard mutable multi-element container in Python is the list.
We can create a list of integers as follows:

In [2]:
L = list(range(10))
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [3]:
type(L[0])

int

Or, similarly, a list of strings:

In [4]:
L2 = [str(c) for c in L]
L2

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [5]:
type(L2[0])

str

Because of Python's dynamic typing, we can even create heterogeneous lists:

In [6]:
L3 = [True, "2", 3.0, 4]
[type(item) for item in L3]

[bool, str, float, int]

But this flexibility comes at a cost: <span class="mark">to allow these flexible types, each item in the list must contain its own type info, reference count, and other information–that is, each item is a complete Python object.</span>
<span class="girk">In the special case that all variables are of the same type, much of this information is redundant: it can be much more efficient to store data in a fixed-type array</span>.
The difference between <span class="girk">a dynamic-type list and a fixed-type (NumPy-style) array</span> is illustrated in the following figure:

![Array Memory Layout](figures/array_vs_list.png)

<span class="mark">At the implementation level, the array essentially contains a single pointer to one contiguous block of data</span>.
<span class="girk">The Python list, on the other hand, contains a pointer to a block of pointers, each of which in turn points to a full Python object like the Python integer</span> we saw earlier.
Again, the advantage of the list is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type.
<span class="burk">Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.</span>
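
To get a feel for the storage difference, the following sketch compares a rough estimate of the memory used by a list of integers with the size of the data buffer of an equivalent NumPy array. The list estimate is only approximate (CPython may share small integer objects, for instance), and the element count `n` is arbitrary.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# The list stores n pointers, and each pointer refers to a separate int object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# The array stores n raw 8-byte integers in one contiguous buffer
array_bytes = np_array.nbytes

print(f"list:  roughly {list_bytes} bytes")
print(f"array: {array_bytes} bytes of data")
```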

## Fixed-Type Arrays in Python

<span class="burk">Python offers several different options for storing data in efficient, fixed-type data buffers.
The built-in ``array`` module</span> (long part of the standard library) can be used to create dense arrays of a uniform type:

In [8]:
import array
L = list(range(10))
A = array.array('i', L)
A

array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Here ``'i'`` is a type code indicating the contents are integers.
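
Because the type code fixes the element type, the array refuses values that do not fit it. A short sketch of that behavior (the flag name `rejected` is just for illustration):

```python
import array

A = array.array('i', range(5))
A.append(5)              # another integer is accepted

try:
    A.append(2.5)        # a float does not fit the 'i' type code
    rejected = False
except TypeError:
    rejected = True

print(A, "float rejected:", rejected)
```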

<span class="burk">Much more useful, however, is the ``ndarray`` object of the NumPy package.
While Python's ``array`` object provides efficient storage of array-based data, NumPy adds to this efficient *operations* on that data.</span>
We will explore these operations in later sections; here we'll demonstrate several ways of creating a NumPy array.

We'll start with the standard NumPy import, under the alias ``np``:

In [2]:
import numpy as np

## Creating Arrays from Python Lists

First, we can <span class="girk">use ``np.array`` to create arrays from Python lists</span>:

In [10]:
# integer array:
np.array([1, 4, 2, 5, 3])

array([1, 4, 2, 5, 3])

Remember that <span class="mark">unlike Python lists, NumPy is constrained to arrays that all contain the same type.
If types do not match, <span class="burk">NumPy will upcast if possible (here, integers are up-cast to floating point)</span>:</span>

In [11]:
np.array([3.14, 4, 2, 3])

array([3.14, 4.  , 2.  , 3.  ])
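
The resulting type can be checked through the ``dtype`` attribute. The sketch below shows a few mixtures and the common type NumPy settles on; the variable names are illustrative only.

```python
import numpy as np

ints_and_float = np.array([3.14, 4, 2, 3])     # int + float -> float64
with_complex = np.array([1, 2.5, 3 + 0j])      # float + complex -> complex128
with_string = np.array([1, 2, "three"])        # numbers + str -> Unicode strings

print(ints_and_float.dtype)
print(with_complex.dtype)
print(with_string.dtype.kind)   # 'U' marks a Unicode string dtype
```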

<span class="birk">If we want to explicitly set the data type of the resulting array, we can use the ``dtype`` keyword:</span>

In [12]:
np.array([1, 2, 3, 4], dtype='float32')

array([1., 2., 3., 4.], dtype=float32)

Finally, unlike Python lists, NumPy arrays can explicitly be multi-dimensional; here's one way of initializing a multidimensional array using a list of lists:

In [13]:
# nested lists result in multi-dimensional arrays
np.array([range(i, i + 3) for i in [2, 4, 6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

The inner lists are treated as rows of the resulting two-dimensional array.

## Creating Arrays from Scratch

Especially <span class="girk">for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy</span>.
Here are several examples:

In [17]:
# Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [18]:
# Create a 3x5 floating-point array filled with ones
np.ones((3, 5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [19]:
# Create a 3x5 array filled with 3.14
np.full((3, 5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [20]:
# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [16]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0, 1, 5)

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])
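
One difference between the two routines is worth spelling out: ``np.arange`` excludes the stop value, while ``np.linspace`` includes it unless ``endpoint=False`` is passed. A small sketch:

```python
import numpy as np

stepped = np.arange(0, 1, 0.25)                    # stop value excluded
spaced = np.linspace(0, 1, 5)                      # stop value included by default
open_ended = np.linspace(0, 1, 4, endpoint=False)  # stop value excluded on request

print(stepped)
print(spaced)
print(open_ended)
```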

In [17]:
# Create a 3x3 array of uniformly distributed
# random values between 0 and 1
np.random.random((3, 3))

array([[ 0.99844933,  0.52183819,  0.22421193],
       [ 0.08007488,  0.45429293,  0.20941444],
       [ 0.14360941,  0.96910973,  0.946117  ]])

In [18]:
# Create a 3x3 array of normally distributed random values
# with mean 0 and standard deviation 1
np.random.normal(0, 1, (3, 3))

array([[ 1.51772646,  0.39614948, -0.10634696],
       [ 0.25671348,  0.00732722,  0.37783601],
       [ 0.68446945,  0.15926039, -0.70744073]])

In [19]:
# Create a 3x3 array of random integers in the interval [0, 10)
np.random.randint(0, 10, (3, 3))

array([[2, 3, 4],
       [5, 7, 8],
       [0, 5, 0]])
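
The random values above will differ from run to run. If reproducible output is needed, one option is a seeded generator from ``np.random.default_rng`` (available in NumPy 1.17 and later). This is a sketch of an alternative interface, not the one used in the cells above, and the seed 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)            # 42 is an arbitrary seed

uniform = rng.random((3, 3))               # uniform on [0, 1)
normal = rng.normal(0, 1, (3, 3))          # mean 0, standard deviation 1
integers = rng.integers(0, 10, (3, 3))     # integers in [0, 10)

# Re-creating the generator with the same seed reproduces the same draws
repeat = np.random.default_rng(42).random((3, 3))
print(np.array_equal(uniform, repeat))
```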

In [20]:
# Create a 3x3 identity matrix
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [21]:
# Create an uninitialized array of three floating-point values (the default dtype)
# The values will be whatever happens to already exist at that memory location
np.empty(3)

array([ 1.,  1.,  1.])

## NumPy Standard Data Types

NumPy arrays contain values of a single type, so it is important to have detailed knowledge of those types and their limitations.
Because <span class="birk">NumPy is built in C, the types will be familiar to users of C</span>, Fortran, and other related languages.

The standard NumPy data types are listed in the following table.
Note that when constructing an array, they can be specified using a string:

```python
np.zeros(10, dtype='int16')
```

Or using the associated NumPy object:

```python
np.zeros(10, dtype=np.int16)
```

| Data type	    | Description |
|---------------|-------------|
| ``bool_``     | Boolean (True or False) stored as a byte |
| ``int_``      | Default integer type (same as C ``long``; normally either ``int64`` or ``int32``)| 
| ``intc``      | Identical to C ``int`` (normally ``int32`` or ``int64``)| 
| ``intp``      | Integer used for indexing (same as C ``ssize_t``; normally either ``int32`` or ``int64``)| 
| ``int8``      | Byte (-128 to 127)| 
| ``int16``     | Integer (-32768 to 32767)|
| ``int32``     | Integer (-2147483648 to 2147483647)|
| ``int64``     | Integer (-9223372036854775808 to 9223372036854775807)| 
| ``uint8``     | Unsigned integer (0 to 255)| 
| ``uint16``    | Unsigned integer (0 to 65535)| 
| ``uint32``    | Unsigned integer (0 to 4294967295)| 
| ``uint64``    | Unsigned integer (0 to 18446744073709551615)| 
| ``float_``    | Shorthand for ``float64`` (this alias was removed in NumPy 2.0).| 
| ``float16``   | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa| 
| ``float32``   | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa| 
| ``float64``   | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa| 
| ``complex_``  | Shorthand for ``complex128`` (this alias was removed in NumPy 2.0).| 
| ``complex64`` | Complex number, represented by two 32-bit floats| 
| ``complex128``| Complex number, represented by two 64-bit floats| 
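
The limits in the table can be queried directly with ``np.iinfo`` (integer types) and ``np.finfo`` (floating-point types). A brief sketch:

```python
import numpy as np

# Integer ranges
for int_type in (np.int8, np.int16, np.uint8, np.uint16):
    info = np.iinfo(int_type)
    print(f"{int_type.__name__:>7}: {info.min} to {info.max}")

# Floating-point layout
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(f"{float_type.__name__:>7}: {info.nmant} mantissa bits, {info.bits} bits total")
```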

<span class="pirk">More advanced type specification is possible, such as specifying big or little endian numbers</span>; for more information, refer to the [NumPy documentation](http://numpy.org/).
<span class="mark">NumPy also supports compound data types, which will be covered in [Structured Data: NumPy's Structured Arrays](02.09-Structured-Data-NumPy.ipynb).</span>
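
As an illustration of byte-order specification, a dtype string can be prefixed with ``>`` (big endian) or ``<`` (little endian). The sketch below stores the same value both ways and inspects the raw bytes:

```python
import numpy as np

big = np.array([1], dtype='>i4')      # big-endian 32-bit integer
little = np.array([1], dtype='<i4')   # little-endian 32-bit integer

print(big.tobytes())      # most significant byte first
print(little.tobytes())   # least significant byte first
print(big[0] == little[0])
```

The two arrays hold the same numeric value; only the order of the bytes in memory differs.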

 |  Routine	     |  Description  | 
 | --------------- | ------------- | 
 | empty(shape[, dtype, order]) | Return a new array of given shape and type, without initializing entries. | 
 | empty_like(prototype[, dtype, order, subok]) | Return a new array with the same shape and type as a given array. | 
 | eye(N[, M, k, dtype, order]) | Return a 2-D array with ones on the diagonal and zeros elsewhere. | 
 | identity(n[, dtype]) | Return the identity array. | 
 | ones(shape[, dtype, order]) | Return a new array of given shape and type, filled with ones. | 
 | ones_like(a[, dtype, order, subok]) | Return an array of ones with the same shape and type as a given array. | 
 | zeros(shape[, dtype, order]) | Return a new array of given shape and type, filled with zeros. | 
 | zeros_like(a[, dtype, order, subok]) | Return an array of zeros with the same shape and type as a given array. | 
 | full(shape, fill_value[, dtype, order]) | Return a new array of given shape and type, filled with fill_value. |  
 | full_like(a, fill_value[, dtype, order, subok]) | Return a full array with the same shape and type as a given array. | 


#### numpy.empty


-- Return a new array of given shape and type, without initializing entries.

```python
numpy.empty(shape, dtype=float, order='C')
```
> **Parameters**:	
**shape** : int or tuple of int  
-- Shape of the empty array, e.g., (2, 3) or 2.   
**dtype** : data-type, optional  
-- Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.    
**order** : {‘C’, ‘F’}, optional, default: ‘C’    
-- Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.    

> **Returns**:	
**out** : ndarray   
-- Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.

Notes

- <span class="burk">[`empty`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.empty.html#numpy.empty "numpy.empty"), unlike [`zeros`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.zeros "numpy.zeros"), does not set the array values to zero,
and may therefore be marginally faster. </span>   
- On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.
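
A typical safe pattern, sketched here with an arbitrary example, is to allocate with ``np.empty`` and then assign every element before reading any of them:

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=np.int64)   # contents are arbitrary at this point

# Assign every element before reading any of them
for i in range(n):
    squares[i] = i * i

print(squares)
```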

In [4]:
%timeit np.zeros(100000000)
%timeit np.empty(100000000)

21.3 ms ± 636 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
20.8 ms ± 519 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [34]:
%timeit np.zeros(100000000)
%timeit np.full(100000000, 0)
%timeit a = np.empty(100000000)

13 ms ± 396 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
210 ms ± 3.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
12.6 ms ± 208 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


#### numpy.empty_like
-- Return a new array with the same shape and type as a given array.

```python
numpy.empty_like(prototype, dtype=None, order='K', subok=True)
```
> **Parameters**:	
**prototype** : array_like   
-- The shape and data-type of prototype define these same attributes of the returned array.   
**dtype** : data-type, optional    
-- Overrides the data type of the result. New in version 1.6.0.   
**order** : {‘C’, ‘F’, ‘A’, or ‘K’}, optional   
-- Overrides the memory layout of the result. ‘C’ means C-order, ‘F’ means F-order, ‘A’ means ‘F’ if prototype is Fortran contiguous, ‘C’ otherwise. ‘K’ means match the layout of prototype as closely as possible.   
**subok** : bool, optional      
-- If True, then the newly created array will use the sub-class type of prototype, otherwise it will be a base-class array. Defaults to True.

> **Returns**:	
**out** : ndarray   
-- Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.

Notes

- This function does not initialize the returned array; to do that use zeros_like or ones_like instead. It may be marginally faster than the functions that do set the array values.

In [5]:
a = ([1,2,3], [4,5,6])
np.empty_like(a)

array([[  69795840,      53362,   13726208],
       [ 225574912, 1912602625,        570]])

In [8]:
a = np.array([[1., 2., 3.],[4.,5.,6.]])
np.empty_like(a)

array([[1., 2., 3.],
       [4., 5., 6.]])
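
Since the values returned by ``np.empty_like`` are arbitrary, the useful things to check are the shape and the dtype. A short sketch, with illustrative variable names:

```python
import numpy as np

proto = np.array([[1., 2., 3.], [4., 5., 6.]])

same = np.empty_like(proto)                         # inherits shape and dtype
as_int = np.empty_like(proto, dtype=np.int32)       # same shape, overridden dtype
from_tuple = np.empty_like(([1, 2, 3], [4, 5, 6]))  # array_like input also works

print(same.shape, same.dtype)
print(as_int.shape, as_int.dtype)
print(from_tuple.shape, from_tuple.dtype.kind)
```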

### What is the difference between contiguous and non-contiguous arrays?

https://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays

Let's first unpack what `K`, `A`, `C`, and `F` stand for. I am referring to the implementation details section of [this](https://github.com/numpy/numpy/blob/9b84c1174125cb32a6be1bb6151782f8b2beda55/doc/neps/nep-0010-new-iterator-ufunc.rst). 

* `C` is C contiguous layout. Mathematically speaking, row major.
* `F` is Fortran contiguous layout. Mathematically speaking, column major.
* `A` is any order: Fortran order if the input is Fortran contiguous, C order otherwise.
* `K` is keep order: match the layout of the input as closely as possible. This is the default for the `*_like` routines.
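
The effect of each `order` value can be checked through the `flags` attribute of the result. A minimal sketch using `np.empty_like` with a row-major and a column-major prototype (the names `c_arr` and `f_arr` are illustrative):

```python
import numpy as np

c_arr = np.zeros((3, 4), order='C')    # row-major prototype
f_arr = np.zeros((3, 4), order='F')    # column-major prototype

keep = np.empty_like(f_arr, order='K')       # follows the prototype: Fortran order
forced_c = np.empty_like(f_arr, order='C')   # always C order
any_f = np.empty_like(f_arr, order='A')      # 'F' because f_arr is Fortran contiguous
any_c = np.empty_like(c_arr, order='A')      # 'C' otherwise

print(keep.flags['F_CONTIGUOUS'], forced_c.flags['C_CONTIGUOUS'])
print(any_f.flags['F_CONTIGUOUS'], any_c.flags['C_CONTIGUOUS'])
```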

A contiguous array is just an array stored in an unbroken block of memory: to access the next value in the array, we just move to the next memory address.

Consider the 2D array `arr = np.arange(12).reshape(3,4)`. It looks like this:

[![enter image description here](https://i.stack.imgur.com/BJIVL.png)](https://i.stack.imgur.com/BJIVL.png)

In the computer's memory, the values of `arr` are stored like this:

[![enter image description here](https://i.stack.imgur.com/MXrA6.png)](https://i.stack.imgur.com/MXrA6.png)

This means `arr` is a **C contiguous** array because the _rows_ are stored as contiguous blocks of memory. The next memory address holds the next row value on that row. If we want to move down a column, we just need to jump over three blocks (e.g. to jump from 0 to 4 means we skip over 1,2 and 3).

Transposing the array with `arr.T` means that C contiguity is lost because adjacent row entries are no longer in adjacent memory addresses. However, `arr.T` is **Fortran contiguous** since the _columns_ are in contiguous blocks of memory:

[![enter image description here](https://i.stack.imgur.com/g6Nb0.png)](https://i.stack.imgur.com/g6Nb0.png)
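
These properties can be inspected directly. The sketch below prints the contiguity flags and the strides (bytes to step along each axis) of `arr` and its transpose; the element size is read from `arr.itemsize` because the default integer width depends on the platform.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
item = arr.itemsize                 # bytes per element (platform dependent)

print(arr.flags['C_CONTIGUOUS'], arr.flags['F_CONTIGUOUS'])
print(arr.T.flags['C_CONTIGUOUS'], arr.T.flags['F_CONTIGUOUS'])

# Strides: bytes to step along each axis
print(arr.strides)      # one row down = 4 elements, one column right = 1 element
print(arr.T.strides)    # the transpose swaps the two strides, no data is moved
```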

---

Performance-wise, accessing memory addresses which are next to each other is very often faster than accessing addresses which are more "spread out" (fetching a value from RAM could entail a number of neighbouring addresses being fetched and cached for the CPU.) This means that operations over contiguous arrays will often be quicker.

As a consequence of C contiguous memory layout, row-wise operations are usually faster than column-wise operations. For example, you'll typically find that 

```
np.sum(arr, axis=1) # sum the rows
```

is slightly faster than:

```
np.sum(arr, axis=0) # sum the columns
```

Similarly, operations on columns will be slightly faster for Fortran contiguous arrays.

---

Finally, why can't we flatten the Fortran contiguous array by assigning a new shape?

```
>>> arr2 = arr.T
>>> arr2.shape = 12
AttributeError: incompatible shape for a non-contiguous array
```

In order for this to be possible NumPy would have to put the rows of `arr.T` together like this:

[![enter image description here](https://i.stack.imgur.com/GhErW.png)](https://i.stack.imgur.com/GhErW.png)

(Setting the `shape` attribute directly assumes C order - i.e. NumPy tries to perform the operation row-wise.)

This is impossible to do. For any axis, NumPy needs to have a _constant_ stride length (the number of bytes to move) to get to the next element of the array. Flattening `arr.T` in this way would require skipping forwards and backwards in memory to retrieve consecutive values of the array.

If we wrote `arr2.reshape(12)` instead, NumPy would copy the values of arr2 into a new block of memory (since it can't return a view on to the original data for this shape).
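
Whether a reshape produced a view or a copy can be checked with `np.shares_memory`. A short sketch contrasting the two cases:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
arr2 = arr.T

flat_view = arr.reshape(12)     # C-contiguous source: a view is possible
flat_copy = arr2.reshape(12)    # transposed source: the values must be copied

print(np.shares_memory(arr, flat_view))
print(np.shares_memory(arr2, flat_copy))
print(flat_copy)
```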

#### numpy.eye

-- Return a 2-D array with ones on the diagonal and zeros elsewhere.

```python
numpy.eye(N, M=None, k=0, dtype=float, order='C')
```

> **Parameters**:  
**N** : int  
-- Number of rows in the output.  
**M** : int, optional  
-- Number of columns in the output. If None, defaults to `N`.  
**k** : int, optional  
-- Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.  
**dtype** : data-type, optional  
-- Data-type of the returned array.  
**order** : {‘C’, ‘F’}, optional  
-- Whether the output should be stored in row-major (C-style) or column-major (Fortran-style) order in memory. New in version 1.14.0.

> **Returns**:  
**I** : ndarray of shape (N,M)  
-- An array where all elements are equal to zero, except for the `k`-th diagonal, whose values are equal to one.

In [10]:
np.eye(2, dtype=float)

array([[1., 0.],
       [0., 1.]])

In [25]:
np.eye(N=5, M=5, k=1, dtype=int)

array([[0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0]])
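
Two further cases follow from the parameter descriptions: a non-square result when `M` differs from `N`, and a negative `k` selecting a diagonal below the main one. A small sketch:

```python
import numpy as np

wide = np.eye(2, 3)                 # N=2 rows, M=3 columns
lower = np.eye(3, k=-1, dtype=int)  # ones on the first diagonal below the main one

print(wide)
print(lower)
```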

#### numpy.identity

-- Return the identity array.   
-- The identity array is a square array with ones on the main diagonal.

> **Parameters**:  
**n** : int  
-- Number of rows (and columns) in `n` x `n` output.  
**dtype** : data-type, optional  
-- Data-type of the output. Defaults to `float`.

> **Returns**:  
**out** : ndarray  
-- `n` x `n` array with its main diagonal set to one, and all other elements 0.

In [26]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
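
As a quick check of what the identity array is, the sketch below compares it with `np.eye` called with default arguments and confirms that multiplying a matrix by it leaves the matrix unchanged. The matrix `A` is an arbitrary example.

```python
import numpy as np

I = np.identity(3)
A = np.arange(9.0).reshape(3, 3)

print(np.array_equal(I, np.eye(3)))   # identity(n) matches eye(n) with default arguments
print(np.array_equal(A @ I, A))       # multiplying by the identity leaves A unchanged
```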

#### numpy.ones

-- Return a new array of given shape and type, filled with ones.

```python
numpy.ones(shape, dtype=None, order='C')
```




> **Parameters**:  
**shape** : int or sequence of ints  
-- Shape of the new array, e.g., `(2, 3)` or `2`.  
**dtype** : data-type, optional  
-- The desired data-type for the array, e.g., `numpy.int8`. Default is `numpy.float64`.  
**order** : {‘C’, ‘F’}, optional, default: C  
-- Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

> **Returns**:  
**out** : ndarray  
-- Array of ones with the given shape, dtype, and order.

In [31]:
np.ones((2,2), dtype=int)

array([[1, 1],
       [1, 1]])

#### numpy.ones_like

-- Return an array of ones with the same shape and type as a given array.

```python
numpy.ones_like(a, dtype=None, order='K', subok=True)
```

> **Parameters**:  
**a** : array_like  
-- The shape and data-type of `a` define these same attributes of the returned array.  
**dtype** : data-type, optional  
-- Overrides the data type of the result. New in version 1.6.0.  
**order** : {‘C’, ‘F’, ‘A’, or ‘K’}, optional  
-- Overrides the memory layout of the result. ‘C’ means C-order, ‘F’ means F-order, ‘A’ means ‘F’ if `a` is Fortran contiguous, ‘C’ otherwise. ‘K’ means match the layout of `a` as closely as possible. New in version 1.6.0.  
**subok** : bool, optional  
-- If True, then the newly created array will use the sub-class type of `a`, otherwise it will be a base-class array. Defaults to True.

> **Returns**:  
**out** : ndarray  
-- Array of ones with the same shape and type as `a`.

In [36]:
x = np.arange(6)
x = x.reshape((2, 3))
print('x:', x)
y = np.ones_like(x)
print('y:', y)

x: [[0 1 2]
 [3 4 5]]
y: [[1 1 1]
 [1 1 1]]
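
The other `*_like` routines from the table work the same way, and the `dtype` argument can override the type taken from the input. A short sketch reusing the same `x`; the fill value 7 is arbitrary.

```python
import numpy as np

x = np.arange(6).reshape((2, 3))

ones_f = np.ones_like(x, dtype=float)   # same shape, dtype overridden to float
zeros = np.zeros_like(x)                # same shape and integer dtype as x
sevens = np.full_like(x, 7)             # same shape and dtype, filled with 7

print(ones_f.dtype, zeros.dtype == x.dtype)
print(sevens)
```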
