# Introduction to NumPy

NumPy is the fundamental package for scientific computing with Python.

- **Powerful N-dimensional arrays:** Fast and versatile, the NumPy vectorization, indexing, and broadcasting concepts are the de-facto standards of array computing today.
- **Numerical computing tools:** NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more.
- **Open source:** Distributed under a liberal BSD license, NumPy is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community.
- **Interoperable:** NumPy supports a wide range of hardware and computing platforms, and plays well with distributed, GPU, and sparse array libraries.
- **Performant:** The core of NumPy is well-optimized C code. Enjoy the flexibility of Python with the speed of compiled code.
- **Easy to use:** NumPy’s high-level syntax makes it accessible and productive for programmers from any background or experience level.
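
The vectorization, indexing, and broadcasting concepts named in the list above can be sketched in a few lines. This is a minimal illustration, not a complete tour; the `readings` array and its values are invented for the example.

```python
import numpy as np

# Sample data, invented for this example: 3 sensors x 4 readings each.
readings = np.array([[1.0, 2.0, 3.0, 4.0],
                     [10.0, 20.0, 30.0, 40.0],
                     [100.0, 200.0, 300.0, 400.0]])

# Vectorization: one expression applies to every element, no Python loop.
doubled = readings * 2

# Indexing: slices and boolean masks select sub-arrays.
first_column = readings[:, 0]            # first reading of each sensor
large_values = readings[readings > 50]   # 1-D array of elements above 50

# Broadcasting: the (3, 1) array of row means is stretched across the
# (3, 4) array, so each row is centered on its own mean.
row_means = readings.mean(axis=1, keepdims=True)
centered = readings - row_means

print(centered)
```

Passing `keepdims=True` keeps the means as a column of shape `(3, 1)`, which is what lets the subtraction broadcast along each row.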

```python
import numpy as np
print(np.__version__)
```

```
1.24.3
```
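
With NumPy imported, the numerical tools mentioned earlier (mathematical functions, random number generators, linear algebra routines, and Fourier transforms) can each be tried in a line or two. The following is a small sketch; the matrix, the seed, and the 5 Hz test signal are arbitrary choices made for illustration.

```python
import numpy as np

# Mathematical functions operate element-wise on whole arrays.
angles = np.linspace(0.0, np.pi, 5)
sines = np.sin(angles)

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Linear algebra: solve the system A @ x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Fourier transform: locate the dominant frequency of a 5 Hz sine
# sampled at 100 Hz for one second.
sample_rate = 100.0
t = np.arange(100) / sample_rate
signal = np.sin(2 * np.pi * 5 * t)
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, d=1 / sample_rate)
peak_freq = freqs[np.argmax(spectrum)]

print(x, peak_freq)
```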


# Ecosystem

## Scientific Domains


Nearly every scientist working in Python draws on the power of NumPy.

NumPy brings the computational power of languages like C and Fortran to Python, a language much easier to learn and use. With this power comes simplicity: a solution in NumPy is often clear and elegant.
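
As a small illustration of that clarity, the sketch below computes a mean squared error twice: once with an explicit Python loop and once as a single NumPy expression. The `predicted` and `observed` values are invented for the example.

```python
import numpy as np

# Values invented for this example.
predicted = [2.5, 0.0, 2.0, 8.0]
observed = [3.0, -0.5, 2.0, 7.0]

# Pure Python: accumulate the squared differences one pair at a time.
total = 0.0
for p, o in zip(predicted, observed):
    total += (p - o) ** 2
mse_loop = total / len(predicted)

# NumPy: the same computation as one array expression.
mse_numpy = np.mean((np.array(predicted) - np.array(observed)) ** 2)

print(mse_loop, mse_numpy)
```

Both versions give the same result; the NumPy form reads closer to the mathematical definition and runs the element-wise work in compiled code.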

### A Comprehensive Overview of Python Libraries for Scientific Computing


**Quantum Computing:**

* **QuTiP:** A Python framework for quantum computing simulation. [https://qutip.org/qutip-tutorials/index-v4](https://qutip.org/qutip-tutorials/index-v4)
* **PyQuil:** A Python library for writing and running quantum programs in Quil. [https://github.com/rigetti/pyquil](https://github.com/rigetti/pyquil)
* **Qiskit:** An open-source quantum computing software development kit. [https://www.ibm.com/quantum/qiskit](https://www.ibm.com/quantum/qiskit)
* **PennyLane:** A Python library for hybrid quantum-classical machine learning. [https://pennylane.ai/index.html](https://pennylane.ai/index.html)

**Statistical Computing:**

* **Pandas:** A data analysis and manipulation library. [https://pandas.pydata.org/](https://pandas.pydata.org/)
* **statsmodels:** A Python module that allows users to perform statistical tests and model data. [https://www.statsmodels.org/](https://www.statsmodels.org/)
* **Xarray:** N-D labeled arrays and datasets in Python. [https://xarray.pydata.org/en/v0.7.1/](https://xarray.pydata.org/en/v0.7.1/)
* **Seaborn:** A Python data visualization library based on matplotlib. [https://seaborn.pydata.org/](https://seaborn.pydata.org/)

**Signal Processing:**

* **SciPy:** A collection of scientific and technical computing tools for Python. [https://lectures.scientific-python.org/](https://lectures.scientific-python.org/)
* **PyWavelets:** A Python toolbox for wavelet transforms. [https://pywavelets.readthedocs.io/](https://pywavelets.readthedocs.io/)
* **python-control:** A Python package for control systems analysis and design. [https://python-control.readthedocs.io/](https://python-control.readthedocs.io/)
* **HyperSpy:** A Python-based hyperspectral imaging analysis library. [https://hyperspy.org/](https://hyperspy.org/)

**Image Processing:**

* **Scikit-image:** A collection of algorithms for image processing in Python. [https://scikit-image.org/](https://scikit-image.org/)
* **OpenCV:** An open-source computer vision library. [https://opencv.org/](https://opencv.org/)
* **Mahotas:** A computer vision and image processing library for Python. [https://mahotas.readthedocs.io/](https://mahotas.readthedocs.io/)

**Graphs and Networks:**

* **NetworkX:** A Python package for creating and analyzing graphs and networks. [https://networkx.org/](https://networkx.org/)
* **graph-tool:** A fast and flexible graph library for Python. [https://forum.skewed.de/c/graph-tool/5](https://forum.skewed.de/c/graph-tool/5)
* **igraph:** A Python library for network analysis and visualization. [https://igraph.org/](https://igraph.org/)
* **PyGSP:** A Python package for graph signal processing. [https://github.com/epfl-lts2/pygsp](https://github.com/epfl-lts2/pygsp)

**Astronomy:**

* **AstroPy:** A Python library for astronomy and astrophysics. [https://www.astropy.org/](https://www.astropy.org/)
* **SunPy:** A Python library for solar physics. [https://sunpy.org/about/](https://sunpy.org/about/)
* **SpacePy:** A Python library for space science and plasma physics. [https://spacepy.github.io/](https://spacepy.github.io/)

**Cognitive Psychology:**

* **PsychoPy:** A Python library for experimental psychology. [https://www.psychopy.org/](https://www.psychopy.org/)

**Bioinformatics:**

* **BioPython:** A Python package for computational biology. [https://biopython.org/](https://biopython.org/)
* **Scikit-Bio:** A library of data structures, algorithms, and educational resources for bioinformatics. [https://scikit.bio/](https://scikit.bio/)
* **PyEnsembl:** A Python interface to the Ensembl genome database. [https://github.com/openvax/pyensembl](https://github.com/openvax/pyensembl)
* **ETE:** A Python toolkit for evolutionary biology. [http://etetoolkit.org/](http://etetoolkit.org/)

**Bayesian Inference:**

* **PyStan:** A Python interface to Stan, a probabilistic programming language. [https://pystan.readthedocs.io/](https://pystan.readthedocs.io/)
* **PyMC3:** A Python package for Bayesian modeling and inference. [https://arxiv.org/abs/1507.08050](https://arxiv.org/abs/1507.08050)
* **ArviZ:** A Python library for Bayesian model analysis. [https://arviz-devs.github.io/](https://arviz-devs.github.io/)
* **emcee:** A Python package for Markov chain Monte Carlo sampling. [https://emcee.readthedocs.io/](https://emcee.readthedocs.io/)

**Mathematical Analysis:**

* **SciPy:** A collection of scientific and technical computing tools for Python. [https://lectures.scientific-python.org/](https://lectures.scientific-python.org/)
* **SymPy:** A Python library for symbolic mathematics. [https://www.sympy.org/](https://www.sympy.org/)
* **cvxpy:** A Python-embedded modeling language for convex optimization. [https://www.cvxpy.org/](https://www.cvxpy.org/)
* **FEniCS:** A Python-based finite element solver. [https://fenicsproject.org/](https://fenicsproject.org/)

**Chemistry:**

* **Cantera:** A Python library for chemical kinetics, thermodynamics, and transport processes. [https://cantera.org/documentation/docs-2.5/sphinx/html/cython/kinetics.html](https://cantera.org/documentation/docs-2.5/sphinx/html/cython/kinetics.html)
* **MDAnalysis:** A Python package for molecular dynamics analysis. [https://www.mdanalysis.org/](https://www.mdanalysis.org/)
* **RDKit:** A cheminformatics toolkit for Python. [https://www.rdkit.org/](https://www.rdkit.org/)
* **PyBaMM:** A Python battery model framework. [https://pybamm.org/](https://pybamm.org/)

**Geoscience:**

* **Pangeo:** A project for building a collaborative, cloud-based data analysis platform for Earth sciences. [https://www.pangeo.io/](https://www.pangeo.io/)
* **SimPEG:** A Python package for geophysical modeling and inversion. [https://simpeg.xyz/](https://simpeg.xyz/)
* **ObsPy:** A Python package for seismology and geophysics. [https://docs.obspy.org/](https://docs.obspy.org/)
* **Fatiando a Terra:** A Python package for geophysical data analysis and modeling. [https://www.fatiando.org/](https://www.fatiando.org/)

**Geographic Processing:**

* **Shapely:** A Python library for geometric operations on planar geometries. [https://shapely.readthedocs.io/](https://shapely.readthedocs.io/)
* **GeoPandas:** A Python library for geospatial data processing. [https://geopandas.org/](https://geopandas.org/)
* **Folium:** A Python library for creating interactive maps. [https://realpython.com/python-folium-web-maps-from-data/](https://realpython.com/python-folium-web-maps-from-data/)

**Architecture & Engineering:**

* **COMPAS:** A Python framework for computational design and fabrication. [http://www.sfu.ca/~herhan/research_i.html](http://www.sfu.ca/~herhan/research_i.html)
* **City Energy Analyst:** A Python tool for analyzing energy use in cities. [https://www.cityenergyanalyst.com/](https://www.cityenergyanalyst.com/)
* **Sverchok:** A node-based visual programming environment for Blender. [https://github.com/nortikin/sverchok](https://github.com/nortikin/sverchok)


## Array Libraries

**NumPy's API** serves as a foundational building block for numerous scientific computing libraries in Python. Its versatility and efficiency have made it a cornerstone for developers working on innovative hardware, specialized array types, or advanced analytics.
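
One way other containers plug into NumPy is the `__array__` protocol: an object that can return itself as an `ndarray` is accepted by NumPy functions. The sketch below is a minimal illustration of that one mechanism, not a description of how any particular library listed here is implemented; the `Celsius` class is hypothetical and invented for the example.

```python
import numpy as np


class Celsius:
    """Hypothetical container of temperatures, invented for this example."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this method when it needs an ndarray view of the object.
        # A fresh array is always returned, so the copy argument is not used.
        return np.array(self._values, dtype=dtype)


temps = Celsius([20.0, 22.5, 19.0])

# NumPy functions accept the custom object directly.
mean_temp = np.mean(temps)

# Converting explicitly gives a regular ndarray to compute with.
as_fahrenheit = np.asarray(temps) * 9 / 5 + 32

print(mean_temp, as_fahrenheit)
```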

### Array Libraries Built on NumPy

**Dask:**

  * **Capabilities:** Distributed arrays and advanced parallelism for analytics.
  * **Application areas:** Large-scale data processing, machine learning, and scientific computing.
  * **Link:** [https://dask.org](https://dask.org)

**CuPy:**

  * **Capabilities:** NumPy-compatible array library for GPU-accelerated computing.
  * **Application areas:** Deep learning, scientific computing, and data analysis.
  * **Link:** [https://cupy.chainer.org/](https://cupy.chainer.org/)

**JAX:**

  * **Capabilities:** Composable transformations of NumPy programs: differentiation, vectorization, just-in-time compilation to GPU/TPU.
  * **Application areas:** Machine learning, scientific computing, and optimization.
  * **Link:** [https://jax.readthedocs.io/](https://jax.readthedocs.io/)

**Xarray:**

  * **Capabilities:** Labeled, indexed multi-dimensional arrays for advanced analytics and visualization.
  * **Application areas:** Earth sciences, climate modeling, and scientific data analysis.
  * **Link:** <https://xarray.pydata.org/en/v0.7.1/>

**Sparse:**

  * **Capabilities:** NumPy-compatible sparse array library that integrates with Dask and SciPy's sparse linear algebra.
  * **Application areas:** Scientific computing, machine learning, and data analysis with sparse data.
  * **Link:** [https://sparse.pydata.org/](https://sparse.pydata.org/)

**Deep Learning Frameworks:**

  * **PyTorch:** A deep learning framework that accelerates the path from research prototyping to production deployment.
    * **Link:** [https://pytorch.org/](https://pytorch.org/)
  * **TensorFlow:** An end-to-end platform for machine learning to easily build and deploy ML-powered applications.
    * **Link:** [https://www.tensorflow.org/](https://www.tensorflow.org/)

**Data Analysis and Processing:**

  * **Arrow:** A cross-language development platform for columnar in-memory data and analytics.
    * **Link:** [https://arrow.apache.org/](https://arrow.apache.org/)
  * **xtensor:** Multi-dimensional arrays with broadcasting and lazy computing for numerical analysis.
    * **Link:** [https://xtensor.readthedocs.io/en/latest/](https://xtensor.readthedocs.io/en/latest/)
  * **Awkward Array:** Manipulate JSON-like data with NumPy-like idioms.
    * **Link:** [https://awkward-array.org/doc/main/](https://awkward-array.org/doc/main/)
  * **uarray:** Python backend system that decouples API from implementation; unumpy provides a NumPy API.
    * **Link:** [https://uarray.readthedocs.io/en/latest/](https://uarray.readthedocs.io/en/latest/)
  * **tensorly:** Tensor learning, algebra and backends to seamlessly use NumPy, PyTorch, TensorFlow or CuPy.
    * **Link:** [https://tensorly.org/](https://tensorly.org/)


## Data Science


**NumPy** serves as a fundamental building block for many data science libraries, providing essential array and matrix operations. Here's a breakdown of common libraries used in typical data science workflows:

### Extract, Transform, Load (ETL)

  * **Pandas:** A powerful library for data manipulation and analysis. <https://pandas.pydata.org/>
  * **Intake:** A data ingestion library that supports various data formats. [https://intake.readthedocs.io/](https://intake.readthedocs.io/)
  * **PyJanitor:** A library for data cleaning and preparation. [https://pyjanitor.readthedocs.io/](https://pyjanitor.readthedocs.io/)

### Exploratory Analysis

  * **Jupyter:** An interactive computing environment that combines code, text, and visualizations. [https://jupyter.org/](https://jupyter.org/)
  * **Seaborn:** A high-level data visualization library built on Matplotlib. <https://seaborn.pydata.org/>
  * **Matplotlib:** A versatile plotting library for creating static, animated, and interactive visualizations. [https://matplotlib.org/](https://matplotlib.org/)
  * **Altair:** A declarative statistical visualization library.

### Model and Evaluate

  * **scikit-learn:** A machine learning library with a wide range of algorithms for classification, regression, clustering, and more. [https://scikit-learn.org/](https://scikit-learn.org/)
  * **statsmodels:** A library for statistical modeling and testing. <https://www.statsmodels.org/>
  * **PyMC3:** A probabilistic programming library for Bayesian modeling. [https://docs.pymc.io/](https://docs.pymc.io/)
  * **spaCy:** A natural language processing library for tasks like text classification and named entity recognition. [https://spacy.io/](https://spacy.io/)

### Report in a Dashboard

  * **Dash:** A Python framework for building web applications and dashboards. [https://dash.plotly.com/](https://dash.plotly.com/)
  * **Panel:** A high-level Python library for building web applications and dashboards. [https://panel.holoviz.org/](https://panel.holoviz.org/)
  * **Voilà:** A tool for converting Jupyter notebooks into standalone web applications. [https://voila.readthedocs.io/](https://voila.readthedocs.io/)

**This workflow provides a solid foundation for many data science tasks. However, the specific choice of libraries may vary depending on the nature of the data and the goals of the analysis.**
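
As a rough illustration of the extract, transform, explore, and model steps above, the sketch below uses only NumPy so that it stays self-contained; in practice the libraries listed under each step would do this work. The CSV contents and the meaning of the two columns are invented for the example.

```python
import io
import numpy as np

# Hypothetical CSV data, invented for this example. One score is missing.
raw = io.StringIO("hours,score\n1,52\n2,55\n3,\n4,61\n5,64\n")

# Extract: parse the text into a float array; the empty field becomes NaN.
data = np.genfromtxt(raw, delimiter=",", skip_header=1)

# Transform: drop every row that contains a NaN.
clean = data[~np.isnan(data).any(axis=1)]
hours, score = clean[:, 0], clean[:, 1]

# Explore: basic summary statistics.
print("mean score:", score.mean(), "std:", score.std())

# Model: fit a least-squares line, score ~ slope * hours + intercept.
slope, intercept = np.polyfit(hours, score, deg=1)
print("slope:", slope, "intercept:", intercept)
```

The reporting step is left out here because it depends on a dashboard library rather than on array operations.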


## Machine Learning

NumPy forms the basis of powerful machine learning libraries like scikit-learn and SciPy. As machine learning grows, so does the list of libraries built on NumPy. TensorFlow’s deep learning capabilities have broad applications — among them speech and image recognition, text-based applications, time-series analysis, and video detection. PyTorch, another deep learning library, is popular among researchers in computer vision and natural language processing.

## Visualization

NumPy is an essential component in the burgeoning Python visualization landscape, which includes Matplotlib, Seaborn, Plotly, Altair, Bokeh, Holoviz, Vispy, Napari, and PyVista, to name a few.

NumPy’s accelerated processing of large arrays allows researchers to visualize datasets far larger than native Python could handle.
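
One common pattern, sketched below under simple assumptions, is to reduce a large dataset to a compact grid with NumPy before handing it to a plotting library. The million random points stand in for real data, and the bin count of 50 is an arbitrary choice.

```python
import numpy as np

# Synthetic data standing in for a large dataset: one million 2-D points.
rng = np.random.default_rng(seed=0)
n_points = 1_000_000
x = rng.normal(size=n_points)
y = 0.5 * x + rng.normal(scale=0.5, size=n_points)

# Reduce the points to a 50 x 50 grid of counts. A plotting library can
# draw this small grid as a heatmap instead of a million markers.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=50)

print(counts.shape, int(counts.sum()))
```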