# Data Analysis  

The number of packages available in Python can be overwhelming. Below is a list of commonly used packages that are particularly useful for analysing acoustic data.  

## Data Manipulation & Processing  

| Package | Description |
|---------|-------------|
| [**pandas**](https://pandas.pydata.org/) | The most widely used library for tabular data manipulation and analysis. Provides DataFrame and Series objects. |
| [**numpy**](https://numpy.org/) | Essential for numerical computing, offering multi-dimensional arrays and fast mathematical operations. |
| [**xarray**](https://xarray.pydata.org/en/stable/) | Designed for multi-dimensional labeled data, commonly used in scientific computing (e.g., climate data). |

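As a minimal sketch of how pandas and numpy combine in practice, the example below averages backscatter values per depth layer. The column names and the values are hypothetical; the only domain assumption is that decibel values should be converted to the linear domain before averaging.

```python
import numpy as np
import pandas as pd

# Hypothetical volume backscatter samples (dB) in two depth layers
df = pd.DataFrame(
    {
        "layer": ["0-50 m", "0-50 m", "50-100 m", "50-100 m"],
        "sv_db": [-70.0, -60.0, -80.0, -80.0],
    }
)

# Convert dB to linear units, average per layer, then convert back to dB
df["sv_linear"] = 10 ** (df["sv_db"] / 10)
mean_linear = df.groupby("layer")["sv_linear"].mean()
mean_db = 10 * np.log10(mean_linear)

print(mean_db)
```

Averaging the dB values directly would give -65 dB for the first layer, whereas the linear-domain mean is noticeably higher because the stronger sample dominates.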

## Big Data & Distributed Computing  

| Package | Description |
|---------|-------------|
| [**dask**](https://dask.org/) | Parallel computing and out-of-core processing for large datasets. |
| [**vaex**](https://vaex.io/) | Optimized for working with large, lazy-loaded datasets efficiently. |


## Data Visualization  

| Package | Description |
|---------|-------------|
| [**matplotlib**](https://matplotlib.org/) | The foundational library for creating static, animated, and interactive plots. |
| [**seaborn**](https://seaborn.pydata.org/) | Built on top of matplotlib, provides high-level statistical visualizations with beautiful default settings. |
| [**plotly**](https://plotly.com/python/) | Interactive and web-based plotting, great for dashboards and exploratory analysis. |
| [**bokeh**](https://bokeh.org/) | Interactive, browser-based plotting, similar in scope to Plotly, with support for streaming data and server-backed applications. |
| [**holoviews**](https://holoviews.org/) | Lets you describe your data declaratively and renders a suitable plot from that description, which reduces plotting boilerplate. Integrates well with Bokeh and Matplotlib. |
| [**datashader**](http://datashader.org/) | Designed for visualizing very large datasets efficiently by rasterizing millions or billions of points into meaningful visualizations. Works well with HoloViews and Bokeh. |

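A short matplotlib sketch of an echogram-style plot is shown below. The data are synthetic random values, and the axis names, colour map, and output file name are assumptions chosen for illustration only.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic "echogram": 50 depth bins by 100 pings of random Sv values (dB)
rng = np.random.default_rng(0)
ping = np.arange(100)
depth = np.linspace(0, 200, 50)
sv = rng.normal(-75, 3, size=(depth.size, ping.size))

fig, ax = plt.subplots(figsize=(6, 3))
mesh = ax.pcolormesh(ping, depth, sv, shading="auto", cmap="viridis")
ax.invert_yaxis()  # depth increases downwards
ax.set_xlabel("Ping number")
ax.set_ylabel("Depth (m)")
fig.colorbar(mesh, ax=ax, label="Sv (dB)")

out = Path("echogram_sketch.png")
fig.savefig(out, dpi=100)
plt.close(fig)
```

For very large echograms, the same array could instead be passed to one of the interactive or rasterizing libraries in the table above.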


## Statistical Analysis  

| Package | Description |
|---------|-------------|
| [**scipy**](https://scipy.org/) | Provides scientific and technical computing tools, including statistical analysis and optimization. |
| [**statsmodels**](https://www.statsmodels.org/) | Used for statistical modeling, hypothesis testing, and econometrics. |
| [**PyMC**](https://docs.pymc.io/) | Bayesian statistical modeling using Markov Chain Monte Carlo (MCMC) methods. Formerly released as PyMC3. |
| [**sympy**](https://www.sympy.org/) | A Python library for symbolic mathematics, including algebraic and calculus functions. |

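As an illustration of the statistical tools in scipy, the sketch below compares two groups of synthetic measurements with Welch's t-test. The group names, means, and sample sizes are made up, and testing directly on dB values is an assumption of this example rather than a recommendation.

```python
import numpy as np
from scipy import stats

# Synthetic mean backscatter values (dB) from two hypothetical survey areas
rng = np.random.default_rng(1)
area_a = rng.normal(loc=-70.0, scale=2.0, size=30)
area_b = rng.normal(loc=-65.0, scale=2.0, size=30)

# Welch's t-test: does not assume equal variances in the two groups
result = stats.ttest_ind(area_a, area_b, equal_var=False)

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3g}")
```

For models with covariates or random effects, statsmodels or PyMC from the table above would be the more natural choice.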
## Geospatial Analysis  

| Package | Description |
|---------|-------------|
| [**gstlearn**](https://gstlearn.org/) | Geostatistics library available for Python and R. It is the successor to the RGeostats project, on which the [ICES Geostatistics CRR](https://ices-library.figshare.com/articles/report/Handbook_of_Geostatistics_in_R_for_Fisheries_and_Marine_Ecology/18624080?file=33403136) is based. |
| [**geopandas**](https://geopandas.org/) | Extends pandas with support for geospatial data and shapefiles. |
| [**shapely**](https://shapely.readthedocs.io/en/stable/) | Geometric operations for geospatial data. |
| [**folium**](https://python-visualization.github.io/folium/) | Interactive maps using Leaflet.js. |
| [**rasterio**](https://rasterio.readthedocs.io/en/latest/) | For reading and writing geospatial raster data (e.g., satellite images). |

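A minimal shapely sketch of a common geospatial task, checking which positions fall inside a survey area, is given below. The polygon corners and station names are hypothetical, and coordinates are treated as plain planar longitude/latitude values.

```python
from shapely.geometry import Point, Polygon

# Hypothetical survey area as a longitude/latitude rectangle (degrees)
survey_area = Polygon([(4.0, 60.0), (6.0, 60.0), (6.0, 61.0), (4.0, 61.0)])

# Hypothetical station positions (longitude, latitude)
positions = {
    "station_1": Point(5.0, 60.5),
    "station_2": Point(7.0, 60.5),
}

# True where the point lies strictly inside the polygon
inside = {name: survey_area.contains(pt) for name, pt in positions.items()}

print(inside)
```

With geopandas, the same test can be applied to a whole table of positions at once.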

## Machine Learning & Deep Learning

| Package | Description |
|---------|-------------|
| [**scikit-learn**](https://scikit-learn.org/stable/) | The go-to library for machine learning, providing a wide range of algorithms and tools. |
| [**xgboost**](https://xgboost.readthedocs.io/en/stable/) | High-performance library for gradient boosting, often used in machine learning competitions. |
| [**lightgbm**](https://lightgbm.readthedocs.io/en/latest/) | A fast and efficient gradient boosting library, particularly for large datasets. |
| [**tensorflow & keras**](https://www.tensorflow.org/) | Popular deep learning frameworks for AI-based data analysis and building neural networks. |
| [**pytorch**](https://pytorch.org/) | A powerful deep learning library, widely used in research and production for deep learning models. |
| [**h2o.ai**](https://www.h2o.ai/) | Open-source machine learning platform that allows for building, training, and deploying models at scale. |
| [**fastai**](https://www.fast.ai/) | A deep learning library built on top of PyTorch that simplifies training and fine-tuning models. |

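The sketch below shows the typical scikit-learn workflow of splitting data, fitting a model, and scoring it. The two features (a frequency difference in dB and a mean Sv) and the two classes are synthetic and purely illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic features per sample: [Sv difference between two frequencies (dB), mean Sv (dB)]
rng = np.random.default_rng(42)
n = 200
class_a = np.column_stack([rng.normal(-5, 1.5, n), rng.normal(-70, 3, n)])
class_b = np.column_stack([rng.normal(5, 1.5, n), rng.normal(-65, 3, n)])
X = np.vstack([class_a, class_b])
y = np.array([0] * n + [1] * n)

# Hold out 25% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print(f"Test accuracy: {accuracy:.2f}")
```

The synthetic classes are well separated, so the accuracy here says nothing about performance on real acoustic data.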


## Image Processing & Basic Operations  

| Package | Description | Works Well with xarray |
|---------|-------------|------------------------|
| [**Pillow (PIL)**](https://pillow.readthedocs.io/en/stable/) | A comprehensive library for opening, manipulating, and saving image files in many formats. Supports basic image operations like resizing, cropping, and filtering. | No |
| [**scikit-image**](https://scikit-image.org/) | Built on NumPy and SciPy, this library provides algorithms for image segmentation, geometric transformations, color space manipulation, and more. | Partly (operates on NumPy arrays, so pass the underlying array of an xarray object and re-attach the coordinates afterwards) |
| [**OpenCV**](https://opencv.org/) | Open-source computer vision library with extensive functionality for real-time image processing, object detection, and camera control. | No (works with numpy arrays but not directly with xarray) |
| [**imageio**](https://imageio.github.io/) | Simple API to read and write image files in various formats, supports animated images, and easy I/O operations. | No |
| [**SimpleITK**](https://simpleitk.readthedocs.io/en/master/) | Provides a simplified interface to the ITK (Insight Segmentation and Registration Toolkit) for image segmentation and registration. | Indirectly (images convert to and from NumPy arrays, which can then be wrapped in xarray objects) |

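Because most of these libraries operate on NumPy arrays, a common pattern with xarray is to pass the underlying array to the image function and then re-attach the labels. The sketch below does this with `skimage.filters.gaussian`; the dimension names, coordinates, and data are hypothetical.

```python
import numpy as np
import xarray as xr
from skimage.filters import gaussian

# Hypothetical labeled 2-D array: random values on a depth-by-ping grid
rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.random((50, 100)),
    dims=("depth", "ping"),
    coords={"depth": np.linspace(0, 200, 50), "ping": np.arange(100)},
)

# scikit-image works on the plain NumPy array held by the DataArray
smoothed_values = gaussian(da.values, sigma=2, preserve_range=True)

# Copy the original object with the new data so dims and coords are kept
smoothed = da.copy(data=smoothed_values)

print(smoothed.dims, smoothed.shape)
```

The same round trip through `.values` applies to OpenCV and other NumPy-based libraries in the table.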