# Pandas Series Introduction

## What is a pandas Series

---

**A pandas Series is a one-dimensional labeled array capable of storing any data type (integer, string, float, Python objects, etc.). A list, tuple, NumPy array, or dictionary can easily be converted into a Series using the Series() constructor. The row labels of a Series are called the index, and a Series holds only a single column of values.**
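
As a small illustration of the description above, the sketch below builds one Series from mixed Python values and another from a tuple, then inspects the index. The variable names are placeholders chosen for this example.

```python
import pandas as pd

# Mixed Python values are stored with the generic "object" dtype
mixed = pd.Series([1, 'two', 3.0])
print(mixed.dtype)             # object

# A tuple works as input as well; the default index counts from 0
from_tuple = pd.Series((10, 20, 30))
print(list(from_tuple.index))  # [0, 1, 2]

# A Series always has exactly one dimension
print(from_tuple.ndim)         # 1
```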

## Pandas Series vs DataFrame

---


- As I explained above, a pandas Series is a one-dimensional labeled array holding a single data type, whereas a DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

- In a DataFrame, each column of data is represented as a pandas Series.

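- Each DataFrame column has its own name/label, whereas a Series has no column labels; it carries only a single optional name, which becomes the column label when the Series is placed in a DataFrame.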

- A DataFrame can be converted to a Series, and one or more Series can be combined into a DataFrame.
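
The relationship between the two structures can be sketched as follows. The column names and values are made up for this illustration; the point is that selecting one DataFrame column gives back a Series, and that the name attribute is the one label a Series carries.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ['Spark', 'PySpark'], 'Fee': [22000, 25000]})

# Selecting one column returns a Series named after the column label
fee = df['Fee']
print(type(fee).__name__)  # Series
print(fee.name)            # Fee

# A standalone Series can be given a name when it is created
s = pd.Series([1, 2, 3], name='counts')
print(s.name)              # counts
```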

## pandas.Series() Constructor

---

**Below is the syntax of pandas Series Constructor, which is used to create Pandas Series objects.**


#### Pandas Series Constructor Syntax
**pandas.Series(data, index, dtype, copy)**


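- data: The values to store. This can be an ndarray, a list, a dict, or a constant (scalar) value.

- index: The row labels. The values must be hashable and the index must have the same length as the data; duplicate labels are allowed. If no index is passed, it defaults to 0, 1, ..., n-1, equivalent to np.arange(n).

- dtype: The data type of the resulting Series. If it is not given, the type is inferred from the data.

- copy: Whether to copy the input data instead of reusing it.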
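
A minimal sketch of how these parameters behave is shown below. The array contents and labels are invented for the example.

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# index supplies the row labels; dtype forces the element type
s = pd.Series(data=arr, index=['a', 'b', 'c'], dtype='float64')
print(s.dtype)         # float64
print(list(s.index))   # ['a', 'b', 'c']

# A constant is repeated once per index label
const = pd.Series(5, index=['x', 'y', 'z'])
print(const.tolist())  # [5, 5, 5]

# copy=True gives the Series its own data,
# so a later change to arr does not show up in the Series
copied = pd.Series(arr, copy=True)
arr[0] = 99
print(copied.iloc[0])  # 1
```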

## Create pandas Series

**A pandas Series can be created in multiple ways: from an array, a list, a dict, or an existing DataFrame.**

## 1 Create Series using array

**Before creating a Series from an array, import the NumPy module and build the array with its array() function. If the data is an ndarray, any index you pass must have the same length as the data; if no index is passed, the default is range(n), where n is the length of the data.**

In [0]:
import pandas as pd 
import numpy as np



In [0]:
# Create Series from array

data = np.array(['python', 'php', 'java'])
series = pd.Series(data)
print(series)

0    python
1       php
2      java
dtype: object


**Notice in the output above that the column doesn’t have a name. The Series also adds an incremental sequence number as the index (the first column) by default.**

**Now, let’s see how to create a pandas Series with a custom index. To do so, use the index param, which takes a list of index values. Make sure the index list matches the data size.**

In [0]:
# Create pandas Series with custom index
s2 = pd.Series(data=data, index=['r1', 'r2', 'r3'])
print(s2)

r1    python
r2       php
r3      java
dtype: object
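
To see why the index list has to match the data size, here is a short sketch that deliberately passes too few labels. The try/except wrapper and the error variable are only there for this illustration.

```python
import numpy as np
import pandas as pd

data = np.array(['python', 'php', 'java'])

error = None
try:
    # Three values but only two labels
    pd.Series(data, index=['r1', 'r2'])
except ValueError as err:
    error = err

print(type(error).__name__)  # ValueError
```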


## 2 Create Series using Dict

---

**A Dict can be used as input. Keys from the Dict are used as the index, and the values become the data of the Series.**

In [0]:
# Create a Series from a Dict

data = {'Courses':'pandas', 'Fee':20000, 'Duration':'30days'}
s2 = pd.Series(data=data)
print(s2)

Courses     pandas
Fee          20000
Duration    30days
dtype: object


### Now let’s see how to ignore the keys of the Dict and supply a custom index while creating a Series.

In [0]:
# Ignore the Dict keys: pass only the values and supply a custom index
data = {'Courses':"pandas", 'Fees':20000, 'Duration':"30days"}
s2 = pd.Series(list(data.values()), index=['Courses', 'Courses_Fees', 'Courses_Duration'])
print(s2)

Courses             pandas
Courses_Fees         20000
Courses_Duration    30days
dtype: object


In [0]:

# Passing the Dict itself together with an index looks up each index label as a key;
# labels that are not keys in the Dict ('Course_Fee', 'Course_Duration') become NaN
data = {'Courses' :"pandas", 'Fees' : 20000, 'Duration' : "30days"}
s2 = pd.Series(data, index=['Courses','Course_Fee','Course_Duration'])
print(s2)


Courses            pandas
Course_Fee            NaN
Course_Duration       NaN
dtype: object
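
The NaN values appear because the labels are matched against the Dict keys rather than assigned in order. One way to keep every value while still using new labels, sketched here as an alternative to passing only the values, is to build the Series first and then replace its index.

```python
import pandas as pd

data = {'Courses': 'pandas', 'Fees': 20000, 'Duration': '30days'}

# Build from the Dict first, then replace the labels position by position
s = pd.Series(data)
s.index = ['Courses', 'Course_Fee', 'Course_Duration']
print(s['Course_Fee'])            # 20000

# Passing index together with the Dict looks each label up as a key instead
looked_up = pd.Series(data, index=['Courses', 'Course_Fee'])
print(looked_up.isna().tolist())  # [False, True]
```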


## 3 Create Series using List

**Below is an example of creating a Series from a List.**

In [0]:
# Creating Series from List
data = ['python', 'php', 'java']
s2 = pd.Series(data=data, index=['r1', 'r2', 'r3'])
print(s2)

r1    python
r2       php
r3      java
dtype: object


## 4 Create Empty Series

**Sometimes you need to create an empty Series. You can do so by calling the constructor without any data.**

In [0]:
# Create empty Series

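# Pass an explicit dtype; the default dtype of an empty Series depends on the pandas version
s = pd.Series(dtype='float64')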
print(s)

Series([], dtype: float64)




## Convert a Series into a DataFrame

---

**To convert Series into a DataFrame, you can use pandas.concat(), pandas.merge(), or DataFrame.join(). Below I explain the approach using the concat() function. For the others, please refer to the article on combining two pandas Series into a DataFrame.**

In [0]:
# Convert series to dataframe
courses = pd.Series(['Spark', 'PySpark', 'Hadoop'], name='courses')
fees = pd.Series([22000, 25000, 23000], name='fees')
df = pd.concat([courses, fees], axis=1)
print(df)

   courses   fees
0    Spark  22000
1  PySpark  25000
2   Hadoop  23000
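
When there is only one Series to convert, Series.to_frame() is another option that the text above does not cover. A minimal sketch, reusing the courses data from the example:

```python
import pandas as pd

courses = pd.Series(['Spark', 'PySpark', 'Hadoop'], name='courses')

# to_frame() turns a single Series into a one-column DataFrame;
# the Series name becomes the column label
df_single = courses.to_frame()
print(list(df_single.columns))  # ['courses']
print(df_single.shape)          # (3, 1)
```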


## Convert pandas DataFrame to Series

1. Single DataFrame column into a Series (from a single-column DataFrame)

2. Specific DataFrame column into a Series (from a multi-column DataFrame)

3. Single row in the DataFrame into a Series

## 1 Convert a single DataFrame column into a Series:


**Let’s create a DataFrame with a single column, then use DataFrame.squeeze() to convert it into a Series:**

In [0]:
# Create DataFrame with single column

data = ['Python', 'PHP', 'Java']
df = pd.DataFrame(data, columns=['Courses'])
my_series = df.squeeze()
print(my_series)
print(type(my_series))

0    Python
1       PHP
2      Java
Name: Courses, dtype: object
<class 'pandas.core.series.Series'>
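
Note that squeeze() only collapses axes of length one. The sketch below, using small made-up DataFrames, shows what comes back for different shapes, and how passing 'columns' restricts the squeeze to the column axis.

```python
import pandas as pd

one_col = pd.DataFrame({'Courses': ['Python', 'PHP', 'Java']})
two_cols = pd.DataFrame({'Courses': ['Python', 'PHP'], 'Fee': [100, 200]})
single_cell = pd.DataFrame({'Fee': [100]})

# One column: the column axis is squeezed away, leaving a Series
print(type(one_col.squeeze()).__name__)               # Series

# Two columns and two rows: nothing to squeeze, the DataFrame is returned
print(type(two_cols.squeeze()).__name__)              # DataFrame

# Squeezing only the column axis keeps a one-row result as a Series
print(type(single_cell.squeeze('columns')).__name__)  # Series
```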


## 2 Convert a specific DataFrame column into a Series:


**If you have a DataFrame with multiple columns and you’d like to convert a specific column into a Series, you can select that column by its label. First, create the DataFrame.**

In [0]:
# Create DataFrame with multiple columns

data = {'Courses': ['Spark', 'PySpark', 'Python'],
        'Duration':['30 days', '40 days', '50 days'],
        'Fee':[20000, 25000, 26000]
        }

df = pd.DataFrame(data, columns=['Courses', 'Duration', 'Fee'])
print(df)
print(type(df))

   Courses Duration    Fee
0    Spark  30 days  20000
1  PySpark  40 days  25000
2   Python  50 days  26000
<class 'pandas.core.frame.DataFrame'>


### Let’s convert the Fee column into a Series.

In [0]:
# Pandas DataFrame column to series
# Selecting a column by label already returns a Series, so squeeze() is not needed
my_series = df['Fee']
print(my_series)

0    20000
1    25000
2    26000
Name: Fee, dtype: int64


## 3 Convert DataFrame Row into a Series

--- 

**Above, we converted DataFrame columns into Series; here, I will explain converting a row into a Series.**

In [0]:
# Convert DataFrame row to series
# iloc with a single integer position already returns the row as a Series

my_series = df.iloc[2]
print(my_series)
print(type(my_series))

Courses      Python
Duration    50 days
Fee           26000
Name: 2, dtype: object
<class 'pandas.core.series.Series'>


## Merge DataFrame and Series

1. Construct a DataFrame from the Series: pass the Series values as a row, repeat that row by multiplying the list by the length, and set the columns to the Series index.

2. After that, merge it with the original DataFrame.

3. In the merge call, set both left_index and right_index to True so the rows are matched on their index.



#### Syntax for merge with the DataFrame.

**df.merge(pd.DataFrame(data = [s.values] * len(s), columns = s.index), left_index=True, right_index=True)**
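
The steps above can be sketched as follows. The DataFrame and Series contents are made up for this example, and the row is repeated len(df) times rather than len(s) times. That substitution is an assumption of this sketch, made so that every row of the DataFrame finds a match when the two lengths differ.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Hadoop']})
s = pd.Series({'Fee': 20000, 'Duration': 30})

# Repeat the Series values once per DataFrame row (this sketch's choice of length)
# and use the Series index as the column labels
expanded = pd.DataFrame(data=[s.values] * len(df), columns=s.index)

# Match rows on their index labels
merged = df.merge(expanded, left_index=True, right_index=True)
print(list(merged.columns))  # ['Courses', 'Fee', 'Duration']
print(merged.shape)          # (3, 3)
```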

## Pandas Series Attributes:

<table><tbody>
<tr><td>T</td><td>Return the transpose, which is by definition self.</td></tr>
<tr><td>array</td><td>The ExtensionArray of the data backing this Series or Index.</td></tr>
<tr><td>at</td><td>Access a single value for a row/column label pair.</td></tr>
<tr><td>attrs</td><td>Dictionary of global attributes of this dataset.</td></tr>
<tr><td>axes</td><td>Return a list of the row axis labels.</td></tr>
<tr><td>dtype</td><td>Return the dtype object of the underlying data.</td></tr>
<tr><td>dtypes</td><td>Return the dtype object of the underlying data.</td></tr>
<tr><td>flags</td><td>Get the properties associated with this pandas object.</td></tr>
<tr><td>hasnans</td><td>Return True if there are any NaNs; enables various perf speedups.</td></tr>
<tr><td>iat</td><td>Access a single value for a row/column pair by integer position.</td></tr>
<tr><td>iloc</td><td>Purely integer-location based indexing for selection by position.</td></tr>
<tr><td>index</td><td>The index (axis labels) of the Series.</td></tr>
<tr><td>is_monotonic</td><td>Alias for is_monotonic_increasing (deprecated; removed in pandas 2.0).</td></tr>
<tr><td>is_monotonic_decreasing</td><td>Return boolean if values in the object are monotonically decreasing.</td></tr>
<tr><td>is_monotonic_increasing</td><td>Return boolean if values in the object are monotonically increasing.</td></tr>
<tr><td>is_unique</td><td>Return boolean if values in the object are unique.</td></tr>
<tr><td>loc</td><td>Access a group of rows and columns by label(s) or a boolean array.</td></tr>
<tr><td>name</td><td>Return the name of the Series.</td></tr>
<tr><td>nbytes</td><td>Return the number of bytes in the underlying data.</td></tr>
<tr><td>ndim</td><td>Number of dimensions of the underlying data, by definition 1.</td></tr>
<tr><td>shape</td><td>Return a tuple of the shape of the underlying data.</td></tr>
<tr><td>size</td><td>Return the number of elements in the underlying data.</td></tr>
<tr><td>values</td><td>Return Series as ndarray or ndarray-like depending on the dtype.</td></tr>
</tbody></table>
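
A few of these attributes in action, as a quick sketch on a small made-up Series:

```python
import pandas as pd

s = pd.Series([3, 1, 2], index=['a', 'b', 'c'], name='scores')

print(s.shape)                    # (3,)
print(s.size)                     # 3
print(s.ndim)                     # 1
print(s.name)                     # scores
print(s.is_unique)                # True
print(s.hasnans)                  # False
print(s.is_monotonic_increasing)  # False

# at looks a value up by label, iat by integer position
print(s.at['b'])                  # 1
print(s.iat[2])                   # 2
```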

## Pandas Series Methods:

<table><tbody>
<tr><td>abs()</td><td>Return a Series/DataFrame with absolute numeric value of each element.</td></tr>
<tr><td>add(other[, level, fill_value, axis])</td><td>Return Addition of series and other, element-wise (binary operator add).</td></tr>
<tr><td>add_prefix(prefix)</td><td>Prefix labels with string prefix.</td></tr>
<tr><td>add_suffix(suffix)</td><td>Suffix labels with string suffix.</td></tr>
<tr><td>agg([func, axis])</td><td>Aggregate using one or more operations over the specified axis.</td></tr>
<tr><td>aggregate([func, axis])</td><td>Aggregate using one or more operations over the specified axis.</td></tr>
<tr><td>align(other[, join, axis, level, copy, …])</td><td>Align two objects on their axes with the specified join method.</td></tr>
<tr><td>all([axis, bool_only, skipna, level])</td><td>Return whether all elements are True, potentially over an axis.</td></tr>
<tr><td>any([axis, bool_only, skipna, level])</td><td>Return whether any element is True, potentially over an axis.</td></tr>
<tr><td>append(to_append[, ignore_index, …])</td><td>Concatenate two or more Series (removed in pandas 2.0; use pandas.concat() instead).</td></tr>
<tr><td>apply(func[, convert_dtype, args])</td><td>Invoke function on values of Series.</td></tr>
<tr><td>argmax([axis, skipna])</td><td>Return int position of the largest value in the Series.</td></tr>
<tr><td>argmin([axis, skipna])</td><td>Return int position of the smallest value in the Series.</td></tr>
<tr><td>argsort([axis, kind, order])</td><td>Return the integer indices that would sort the Series values.</td></tr>
<tr><td>asfreq(freq[, method, how, normalize, …])</td><td>Convert time series to specified frequency.</td></tr>
<tr><td>asof(where[, subset])</td><td>Return the last row(s) without any NaNs before where.</td></tr>
<tr><td>astype(dtype[, copy, errors])</td><td>Cast a pandas object to a specified dtype.</td></tr>
<tr><td>at_time(time[, asof, axis])</td><td>Select values at particular time of day (e.g., 9:30AM).</td></tr>
<tr><td>autocorr([lag])</td><td>Compute the lag-N autocorrelation.</td></tr>
<tr><td>backfill([axis, inplace, limit, downcast])</td><td>Synonym for DataFrame.fillna() with method='bfill'.</td></tr>
<tr><td>between(left, right[, inclusive])</td><td>Return boolean Series equivalent to left &lt;= series &lt;= right.</td></tr>
<tr><td>between_time(start_time, end_time[, …])</td><td>Select values between particular times of the day (e.g., 9:00-9:30 AM).</td></tr>
</tbody></table>


**Continued...**


<table><tbody>
<tr><td>bfill([axis, inplace, limit, downcast])</td><td>Synonym for DataFrame.fillna() with method='bfill'.</td></tr>
<tr><td>bool()</td><td>Return the bool of a single element Series or DataFrame.</td></tr>
<tr><td>cat</td><td>alias of pandas.core.arrays.categorical.CategoricalAccessor</td></tr>
<tr><td>clip([lower, upper, axis, inplace])</td><td>Trim values at input threshold(s).</td></tr>
<tr><td>combine(other, func[, fill_value])</td><td>Combine the Series with a Series or scalar according to func.</td></tr>
<tr><td>combine_first(other)</td><td>Update null elements with value in the same location in ‘other’.</td></tr>
<tr><td>compare(other[, align_axis, keep_shape, …])</td><td>Compare to another Series and show the differences.</td></tr>
<tr><td>convert_dtypes([infer_objects, …])</td><td>Convert columns to best possible dtypes using dtypes supporting pd.NA.</td></tr>
<tr><td>copy([deep])</td><td>Make a copy of this object’s indices and data.</td></tr>
<tr><td>corr(other[, method, min_periods])</td><td>Compute correlation with other Series, excluding missing values.</td></tr>
<tr><td>count([level])</td><td>Return number of non-NA/null observations in the Series.</td></tr>
<tr><td>cov(other[, min_periods, ddof])</td><td>Compute covariance with Series, excluding missing values.</td></tr>
<tr><td>cummax([axis, skipna])</td><td>Return cumulative maximum over a DataFrame or Series axis.</td></tr>
<tr><td>cummin([axis, skipna])</td><td>Return cumulative minimum over a DataFrame or Series axis.</td></tr>
<tr><td>cumprod([axis, skipna])</td><td>Return cumulative product over a DataFrame or Series axis.</td></tr>
<tr><td>cumsum([axis, skipna])</td><td>Return cumulative sum over a DataFrame or Series axis.</td></tr>
<tr><td>describe([percentiles, include, exclude, …])</td><td>Generate descriptive statistics.</td></tr>
<tr><td>diff([periods])</td><td>First discrete difference of element.</td></tr>
<tr><td>div(other[, level, fill_value, axis])</td><td>Return Floating division of series and other, element-wise (binary operator truediv).</td></tr>
<tr><td>divide(other[, level, fill_value, axis])</td><td>Return Floating division of series and other, element-wise (binary operator truediv).</td></tr>
<tr><td>divmod(other[, level, fill_value, axis])</td><td>Return Integer division and modulo of series and other, element-wise (binary operator divmod).</td></tr>
<tr><td>dot(other)</td><td>Compute the dot product between the Series and the columns of other.</td></tr>
<tr><td>drop([labels, axis, index, columns, level, …])</td><td>Return Series with specified index labels removed.</td></tr>
<tr><td>drop_duplicates([keep, inplace])</td><td>Return Series with duplicate values removed.</td></tr>
<tr><td>droplevel(level[, axis])</td><td>Return Series/DataFrame with requested index / column level(s) removed.</td></tr>
<tr><td>dropna([axis, inplace, how])</td><td>Return a new Series with missing values removed.</td></tr>
<tr><td>dt</td><td>alias of pandas.core.indexes.accessors.CombinedDatetimelikeProperties</td></tr>
<tr><td>duplicated([keep])</td><td>Indicate duplicate Series values.</td></tr>
<tr><td>eq(other[, level, fill_value, axis])</td><td>Return Equal to of series and other, element-wise (binary operator eq).</td></tr>
<tr><td>equals(other)</td><td>Test whether two objects contain the same elements.</td></tr>
<tr><td>ewm([com, span, halflife, alpha, …])</td><td>Provide exponential weighted (EW) functions.</td></tr>
<tr><td>expanding([min_periods, center, axis, method])</td><td>Provide expanding transformations.</td></tr>
<tr><td>explode([ignore_index])</td><td>Transform each element of a list-like to a row.</td></tr>
<tr><td>factorize([sort, na_sentinel])</td><td>Encode the object as an enumerated type or categorical variable.</td></tr>
<tr><td>ffill([axis, inplace, limit, downcast])</td><td>Synonym for DataFrame.fillna() with method='ffill'.</td></tr>
<tr><td>fillna([value, method, axis, inplace, …])</td><td>Fill NA/NaN values using the specified method.</td></tr>
<tr><td>filter([items, like, regex, axis])</td><td>Subset the dataframe rows or columns according to the specified index labels.</td></tr>
<tr><td>first(offset)</td><td>Select initial periods of time series data based on a date offset.</td></tr>
<tr><td>first_valid_index()</td><td>Return index for first non-NA value or None, if no non-NA value is found.</td></tr>
</tbody></table>
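
To make a handful of these methods concrete, here is a short sketch on a made-up numeric Series. Only calls that appear in the tables above are used.

```python
import pandas as pd

s = pd.Series([-2, 5, 5, 8])

print(s.abs().tolist())                   # [2, 5, 5, 8]
print(s.clip(lower=0, upper=6).tolist())  # [0, 5, 5, 6]
print(s.cumsum().tolist())                # [-2, 3, 8, 16]

# between() is inclusive on both ends by default
print(s.between(0, 5).tolist())           # [False, True, True, False]

# drop_duplicates() keeps the first occurrence of each value
print(s.drop_duplicates().tolist())       # [-2, 5, 8]

# diff() leaves NaN in the first position; fillna() replaces it
print(s.diff().fillna(0).tolist())        # [0.0, 7.0, 0.0, 3.0]
```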