#### pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python.

#### pandas is well suited for many different kinds of data:
1. __Tabular data with heterogeneously-typed__ columns, as in an SQL table or Excel spreadsheet.
2. __Ordered and unordered__ (not necessarily fixed-frequency) time series data.
3. __Arbitrary matrix data__ (homogeneously typed or heterogeneous) __with row and column labels__.
4. Any __other form of observational / statistical data sets__. The data __need not be labeled__ at all to be placed
into a pandas data structure.

The two primary data structures of pandas, __Series (1-dimensional) and DataFrame (2-dimensional)__, handle the
vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. 
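As a minimal sketch of the two structures, the snippet below builds one Series and one DataFrame. The variable names and sample values (`prices`, `trades`, the tickers) are made up for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array; the labels form its index.
prices = pd.Series([10.5, 11.0, 9.75], index=["mon", "tue", "wed"], name="price")

# A DataFrame is a two-dimensional table whose columns may have different types.
trades = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "AAA"],   # hypothetical sample data
        "shares": [100, 250, 50],
        "price": [10.5, 20.0, 11.0],
    }
)

print(prices["tue"])           # look up a value by its label
print(trades.dtypes)           # one dtype per column
print(trades["shares"].sum())  # a single column is itself a Series
```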

pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.
#### Here are just a few of the things that pandas does well:

• Easy handling of __missing data (represented as NaN)__ in floating point as well as non-floating point data

• __Size mutability__: columns can be __inserted into and deleted from a DataFrame__ and higher dimensional objects

• Automatic and explicit __data alignment__: objects can be explicitly aligned to a set of labels, or the user can
simply ignore the labels and let Series, DataFrame, etc. automatically align the data in computations

• Powerful, flexible __group by functionality__ to perform split-apply-combine operations on data sets, for both aggregating and transforming data

• __Easy conversion__ of ragged, differently-indexed data in other Python and NumPy data structures into
DataFrame objects

• Intelligent __label-based slicing, fancy indexing, and subsetting__ of large data sets

• Intuitive __merging and joining__ of data sets

• Flexible __reshaping and pivoting__ of data sets

• __Hierarchical labeling__ of axes (possible to have multiple labels per tick)

• Robust IO __tools for loading data from flat files__ (CSV and delimited), Excel files, and databases, and for saving data to and loading
data from the __ultrafast HDF5 format__

• __Time series-specific functionality__: date range generation and frequency conversion, moving window statistics,
date shifting and lagging, etc.
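A few of the features listed above can be sketched in a handful of lines. The example below touches label alignment, missing data, group by, and basic time series operations. All data and variable names are hypothetical.

```python
import pandas as pd

# Two Series with partially overlapping labels.
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Arithmetic aligns on labels; "x" has no partner in b, so the result is NaN there.
total = a + b

# Missing values can be counted and filled explicitly.
n_missing = total.isna().sum()
filled = total.fillna(0.0)

# Split-apply-combine: group rows by a key, then aggregate each group.
sales = pd.DataFrame(
    {"region": ["north", "south", "north", "south"], "amount": [5, 7, 3, 2]}
)
per_region = sales.groupby("region")["amount"].sum()

# Time series: generate a daily date range, then compute a moving
# window statistic and a one-step lag.
days = pd.date_range("2024-01-01", periods=4, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=days)
rolling_mean = ts.rolling(window=2).mean()  # first entry is NaN (window not full)
lagged = ts.shift(1)                        # first entry is NaN (nothing to shift in)
```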


Many of these features exist to address shortcomings frequently experienced when using other languages / scientific
research environments. For data scientists, working with data is typically divided into multiple stages: munging and
cleaning data, analyzing / modeling it, then organizing the results of the analysis into a form suitable for plotting or
tabular display. pandas is designed to support all of these stages.
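The three stages might look like this on a toy data set. This is a sketch under assumed inputs: the `raw` table, its column names, and the cleaning rules are invented for illustration, and real pipelines will need different steps.

```python
import pandas as pd

raw = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
        "year": [2020, 2021, 2020, 2021, 2021],
        "temp": [5.0, None, 15.0, 16.0, 16.0],  # one missing reading, one duplicate row
    }
)

# 1. Munge and clean: drop exact duplicate rows and rows with a missing reading.
clean = raw.drop_duplicates().dropna(subset=["temp"])

# 2. Analyze: average temperature per city and year.
summary = clean.groupby(["city", "year"], as_index=False)["temp"].mean()

# 3. Organize for display: one row per city, one column per year.
table = summary.pivot(index="city", columns="year", values="temp")
print(table)
```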

#### Series --- 1D labeled homogeneously-typed array
#### DataFrame --- General 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed columns


### Why more than one data structure?

#### The best way to think about the pandas data structures is as flexible containers for lower dimensional data. For example, DataFrame is a container for Series, and Series is a container for scalars. We would like to be able to insert and remove objects from these containers in a dictionary-like fashion.
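The dictionary-like behavior can be sketched as follows. The column names and values are arbitrary examples.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Insert a new column by key, as with a dict.
df["c"] = df["a"] + df["b"]

# Retrieve a column by key; the result is a Series.
col_a = df["a"]

# Remove columns: del drops by key, pop drops and returns the column.
del df["b"]
popped = df.pop("c")

# A Series is in turn a container for scalars, looked up by label.
s = pd.Series({"x": 1.5, "y": 2.5})
print(s["y"])
print("x" in s)  # membership tests the labels, like dict keys
```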


Also, we would like sensible default behaviors for the common API functions which take into account the typical
__orientation of time series and cross-sectional data sets__. __When using ndarrays__ to store 2- and 3-dimensional data, a
__burden is placed on the user to consider the orientation of the data set when writing functions; axes are considered
more or less equivalent__ (except when C- or Fortran-contiguousness matters for performance). __In pandas, the axes are
intended to lend more semantic meaning to the data; i.e., for a particular data set there is likely to be a “right” way to
orient the data.__ The goal, then, is to reduce the amount of mental effort required to code up data transformations in
downstream functions.

For example, with tabular data (DataFrame) it is more semantically helpful to think of the index (the rows) and the
columns rather than axis 0 and axis 1. Iterating through the columns of the DataFrame thus results in more readable
code.
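A minimal sketch of iterating over columns, assuming a small hypothetical DataFrame of measurements:

```python
import pandas as pd

df = pd.DataFrame(
    {"height_m": [1.5, 2.0], "weight_kg": [60.0, 80.0]},
    index=["ann", "bob"],  # hypothetical row labels
)

column_means = {}
for col in df.columns:
    series = df[col]  # each column is a Series indexed by the row labels
    column_means[col] = series.mean()

print(column_means)
```

The loop refers to columns by name, so the reader does not need to remember which axis number holds which kind of information.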

### Mutability and copying of data
__All pandas data structures are value-mutable (the values they contain can be altered) but not always size-mutable__. The
length of a Series cannot be changed, but, for example, columns can be inserted into a DataFrame. However, the vast
majority of methods produce new objects and leave the input data untouched. In general we like to favor immutability
where sensible.
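The difference between methods that return new objects and operations that change an object in place can be seen in a short sketch. The data here is arbitrary, and the methods shown are only a few common examples.

```python
import pandas as pd

s = pd.Series([3, 1, 2], index=["a", "b", "c"])

# Most methods return a new object and leave the input untouched.
sorted_s = s.sort_values()

# Values can be altered in place; sorted_s is a separate object and is unaffected.
s["a"] = 30

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# drop() returns a new DataFrame; df keeps both of its columns.
without_y = df.drop(columns="y")

# Inserting a column changes the size of df itself.
df["z"] = df["x"] * 10

print(s.tolist(), sorted_s.tolist())
print(list(df.columns), list(without_y.columns))
```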