# Pandas Resources

## DataFrame initializers

|Initializer|Description|
|----|----|
|2D ndarray|A matrix of data; row and column labels can optionally be passed|
|dict of arrays, lists, or tuples|Each sequence becomes a column in the DataFrame. All sequences must be the same length.|
|NumPy structured/record array|Treated as the “dict of arrays” case|
|dict of Series|Each value becomes a column. Indexes from each Series are union-ed together to form the result’s row index if no explicit index is passed.|
|dict of dicts|Each inner dict becomes a column. Keys are union-ed to form the row index as in the “dict of Series” case.|
|list of dicts or Series|Each item becomes a row in the DataFrame. The union of dict keys or Series indexes becomes the DataFrame’s column labels.|
|list of lists or tuples|Treated as the “2D ndarray” case|
|Another DataFrame|The DataFrame’s indexes are used unless different ones are passed|
|NumPy MaskedArray|Like the “2D ndarray” case except masked values become NA/missing in the DataFrame result|
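
A few of the initializers in the table can be sketched as follows. The column names, index labels, and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# dict of lists: each sequence becomes a column (all the same length)
df_cols = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# dict of Series: the row index is the union of the Series indexes,
# and positions missing from a Series are filled with NaN
df_union = pd.DataFrame({
    "x": pd.Series([1, 2], index=["r1", "r2"]),
    "y": pd.Series([3, 4], index=["r2", "r3"]),
})

# list of dicts: each item becomes a row; the keys are unioned into columns
df_rows = pd.DataFrame([{"a": 1, "b": 2}, {"a": 3, "c": 4}])

# 2D ndarray with explicit row and column labels
df_arr = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["r1", "r2"],
    columns=["a", "b", "c"],
)
```

In the "dict of Series" and "list of dicts" cases, any cell without a source value ends up as NaN, which is worth checking for before doing arithmetic.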


## Pandas I/O functions


|Format|Input function|Output function|
|----|----|----|
|CSV|read_csv()|to_csv()|
|Delimited file (generic)|read_table()|to_csv()|
|Excel worksheet|read_excel()|to_excel()|
|Fixed-width fields|read_fwf()||
|Google BigQuery|read_gbq()|to_gbq()|
|HDF5|read_hdf()|to_hdf()|
|HTML table|read_html()|to_html()|
|JSON|read_json()|to_json()|
|OS clipboard data|read_clipboard()|to_clipboard()|
|Parquet|read_parquet()|to_parquet()|
|pickle|read_pickle()|to_pickle()|
|SAS|read_sas()||
|SQL query|read_sql()|to_sql()|

<div class="alert alert-block alert-info">
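<b>NOTE:</b> By default, the `read_...()` functions return a new DataFrame. The main exception is `read_html()`, which returns a list of DataFrames. Some options change the return type; for example, passing `chunksize` to `read_csv()` yields an iterator of DataFrames.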
</div>
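
As a minimal sketch of how the input and output functions pair up, the round trip below writes a DataFrame to CSV and JSON text and reads it back. It uses in-memory buffers instead of files, and the sample data is invented for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "score": [1.5, 2.5]})

# to_csv() with no path returns the CSV text as a string
csv_text = df.to_csv(index=False)

# read_csv() accepts any file-like object, so an in-memory buffer works
df_csv = pd.read_csv(io.StringIO(csv_text))

# The same pattern applies to JSON
json_text = df.to_json(orient="records")
df_json = pd.read_json(io.StringIO(json_text))
```

Passing a file path instead of a buffer works the same way for both the `read_...()` and `to_...()` calls.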


## Methods for computations

|Method|Returns|
|----|----|
|abs()|absolute values|
|corr()|pairwise correlations|
|count()|number of values|
|cov()|pairwise covariances|
|cumsum()|cumulative sums|
|cumprod()|cumulative products|
|cummin(), cummax()|cumulative minimum, maximum|
|kurt()|unbiased kurtosis|
|median()|median|
|min(), max()|minimum, maximum values|
|prod()|products|
|quantile()|values at given quantile|
|skew()|unbiased skewness|
|std()|standard deviation|
|var()|variance|
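
The sketch below calls a handful of these methods on a small DataFrame with one missing value, to show the `axis` argument and the default NA handling. The data is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [4.0, 5.0, 6.0]})

counts = df.count()          # non-NA values per column
medians = df.median()        # one value per column (axis=0 is the default)
row_max = df.max(axis=1)     # one value per row
running = df.cumsum()        # same shape as df; NaN stays NaN
a_std = df["a"].std()        # the same methods can be called on a Series
corr = df.corr()             # pairwise correlations between columns
```

Here `counts` is a Series indexed by column name, while `running` and `corr` are DataFrames.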

> TODO: which are called from Series, and which are called from DataFrames

## Data description and summaries
|Method or attribute|Description|
|----|----|
|DF.columns|Get or set column labels|
|DF.index|Get or set row labels|
|DF.shape<br/>S.shape|Get the shape (length of each axis) as a tuple|
|DF.head(n)<br/>DF.tail(n)|Return n rows (default 5) from the beginning or end|
|DF.describe()<br/>S.describe()|Return summary statistics|
|DF.info()|Print column names, dtypes, and non-null counts|
|DF.values<br/>S.values|Get the underlying values (typically a NumPy array)|
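
A short sketch of inspecting a DataFrame with these tools follows. Note that in pandas `shape`, `columns`, and `index` are attributes, accessed without parentheses, and that row labels live in `index`. The column names and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6],
    "b": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

n_rows, n_cols = df.shape          # attribute: a (rows, columns) tuple
df.columns = ["first", "second"]   # column labels can be reassigned
row_labels = list(df.index)        # row labels come from the index

top = df.head(2)                   # first 2 rows
bottom = df.tail()                 # last 5 rows by default
stats = df.describe()              # count, mean, std, min, quartiles, max
```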

<div class="alert alert-block alert-info">
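<b>NOTE:</b> The computation methods listed earlier (`median()`, `std()`, and so on) return Series or DataFrames, as appropriate. On a DataFrame they can be computed down the rows (axis=0, one result per column) or across the columns (axis=1, one result per row). They generally skip NA/null values.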
</div>

## Accessors for selecting rows and columns
|Accessor|Description|
|----|----|
|DF.loc[row_indexer, col_indexer]|Multi-axis indexing by label (not by position).<sup>1,3,4</sup>|
|DF.iloc[row_indexer, col_indexer]|Multi-axis indexing by position (not by labels).<sup>2,3,4</sup>|

<sup>1</sup> Indexers are a label, a slice of labels, an iterable of labels, or a Boolean array. Label slices include both endpoints.

<sup>2</sup> Indexers are a numeric position (0-based), a slice of positions, an iterable of positions, or a Boolean array. Position slices exclude the stop position.

<sup>3</sup> If only row indexer is supplied, all columns are selected

<sup>4</sup> `:` selects all rows (or all columns, when used as the column indexer)
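
The indexer forms described in the footnotes can be sketched as below. The labels and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["x", "y", "z"],
)

by_label = df.loc["x":"y", ["a", "c"]]   # label slice includes "y"
by_pos = df.iloc[0:2, [0, 2]]            # position slice excludes row 2
one_row = df.loc["y"]                    # row indexer only: all columns
one_col = df.loc[:, "b"]                 # ":" keeps every row
filtered = df.loc[df["a"] > 1, "c"]      # Boolean mask as the row indexer
```

Here `by_label` and `by_pos` select the same two rows and two columns, once by label and once by position.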
