# Introduction to NumPy

This notebook outlines techniques for effectively loading, storing, and manipulating in-memory data in Python. The topic is very broad: datasets can come from a wide range of sources and in a wide range of formats, including collections of documents, collections of images, collections of sound clips, collections of numerical measurements, or nearly anything else. Despite this apparent heterogeneity, it will help us to think of all data fundamentally as arrays of numbers.

For example, images, particularly digital images, can be thought of as simply two-dimensional arrays of numbers representing pixel brightness across the area.
Sound clips can be thought of as one-dimensional arrays of intensity versus time.
Text can be converted in various ways into numerical representations, perhaps binary digits representing the frequency of certain words or pairs of words.
No matter what the data are, the first step in making them analyzable will be to transform them into arrays of numbers.
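
As a rough illustration of the three cases above, the following sketch builds a tiny image, a short sound clip, and a word-count vector as NumPy arrays. All of the values, the 5 Hz sine wave, and the three-word vocabulary are made up for illustration; real data would normally be loaded from files.

```python
import numpy as np

# A tiny 3x4 grayscale "image": each entry is a pixel brightness in 0-255.
image = np.array([[0, 64, 128, 255],
                  [32, 96, 160, 224],
                  [16, 80, 144, 208]], dtype=np.uint8)

# A short "sound clip": a 5 Hz sine wave sampled 100 times over one second.
t = np.linspace(0.0, 1.0, num=100, endpoint=False)
sound = np.sin(2 * np.pi * 5 * t)

# A toy text encoding: count how often each vocabulary word occurs.
vocabulary = ["data", "array", "python"]   # hypothetical vocabulary
text = "python array data array"
words = text.split()
word_counts = np.array([words.count(w) for w in vocabulary])

print(image.shape)   # (3, 4)  -> two-dimensional
print(sound.shape)   # (100,)  -> one-dimensional
print(word_counts)   # [1 2 1]
```

Whatever the source, each dataset ends up as an array with a shape and a single numeric type.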

For this reason, efficient storage and manipulation of numerical arrays is absolutely fundamental to the process of doing data science.
We'll now take a look at the specialized tools that Python has for handling such numerical arrays: the NumPy package, and the Pandas package.

This lesson will cover NumPy in detail. NumPy (short for *Numerical Python*) provides an efficient interface to store and operate on dense data buffers.
In some ways, NumPy arrays are like Python's built-in ``list`` type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size.
NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python, so time spent learning to use NumPy effectively will be valuable no matter what aspect of data science interests you.
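
To make the comparison with ``list`` concrete, here is a minimal sketch that doubles every element of a list and of an array. The sample values and the explicit ``int64`` dtype are choices made for this example, not something the text prescribes.

```python
import numpy as np

values = [1, 2, 3, 4, 5]                  # built-in list
arr = np.array(values, dtype=np.int64)    # fixed-type NumPy array

# Doubling a list needs an explicit loop (here, a comprehension) ...
doubled_list = [v * 2 for v in values]
# ... while the array applies the operation to every element at once.
doubled_arr = arr * 2

print(doubled_list)          # [2, 4, 6, 8, 10]
print(doubled_arr.tolist())  # [2, 4, 6, 8, 10]

# Every element shares one dtype, so the data sits in one compact buffer.
print(arr.dtype)     # int64
print(arr.itemsize)  # 8 (bytes per element)
print(arr.nbytes)    # 40 (bytes in total)
```

Note that ``values * 2`` on a list would repeat the list rather than double its elements, which is one reason element-wise arithmetic is more natural with arrays.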


## Getting Started

As with most Python libraries, we need to import NumPy before we can use it. To check that it is installed on our system and see which version we have, we can run the following commands:

In [2]:
import numpy
numpy.__version__

'1.26.4'

In [5]:
numpy 

<module 'numpy' from '/home/mattpower/anaconda3/lib/python3.12/site-packages/numpy/__init__.py'>
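
If you would rather check for NumPy without risking an ``ImportError``, one possible approach is sketched below using the standard library's ``importlib.util.find_spec``. The variable name ``numpy_installed`` and the printed messages are only illustrative. The sketch also uses the conventional ``np`` alias, which the NumPy docstring examples assume.

```python
import importlib.util

# find_spec returns None when the package cannot be found on the import path.
spec = importlib.util.find_spec("numpy")
numpy_installed = spec is not None

if numpy_installed:
    import numpy as np  # conventional short alias
    print("NumPy version:", np.__version__)
else:
    print("NumPy is not installed; try `pip install numpy` or `conda install numpy`.")
```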

To explore the contents of the NumPy namespace, you can use tab completion:

numpy.<TAB>

Here, <TAB> stands for the Tab key on your keyboard. To access the NumPy documentation, you can type

In [6]:
numpy?

Type:        module
String form: <module 'numpy' from '/home/mattpower/anaconda3/lib/python3.12/site-packages/numpy/__init__.py'>
File:        ~/anaconda3/lib/python3.12/site-packages/numpy/__init__.py
Docstring:
NumPy
=====

Provides
  1. An array object of arbitrary homogeneous items
  2. Fast mathematical operations over arrays
  3. Linear Algebra, Fourier Transforms, Random Number Generation

How to use the documentation
----------------------------
Documentation is available in two forms: docstrings provided
with the code, and a loose standing reference guide, available from
`the NumPy homepage <https://numpy.org>`_.

We recommend exploring the docstrings using
`IPython <https://ipython.org>`_, an advanced Python shell with
TAB-completion and introspection capabilities.  See below for further
instructions.

The docstring examples assume that `numpy` has been imported as ``np``::

  >>> import numpy as np

Code snippets are indicated by
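
Tab completion and the ``?`` suffix are IPython/Jupyter features. In a plain Python session, a rough equivalent can be sketched with the built-in ``dir`` and the ``__doc__`` attribute. The variable names below are illustrative only.

```python
import numpy as np

# Rough stand-in for `numpy.<TAB>`: list the public names in the namespace.
public_names = [name for name in dir(np) if not name.startswith("_")]
print(len(public_names), "public names, for example:", public_names[:5])

# Rough stand-in for `numpy?`: the module docstring is a plain string.
module_doc = np.__doc__
print(module_doc.strip().splitlines()[0])

# The same works for individual functions, such as np.ones.
ones_doc = np.ones.__doc__
print(ones_doc.strip().splitlines()[0])
```

For interactive reading, the built-in ``help(np.ones)`` displays the same docstring in a pager.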

Credits: Jake VanderPlas (Python Data Science Handbook)