# Pandas vs SQL:
### Pandas:
- Setup is easy
- Complexity is low, since Pandas is just a package that needs to be imported
- Reliability and scalability are lower, since the data is typically held in the memory of a single machine
- Security is weaker, since Pandas has no built-in access control or transaction guarantees
- Math, statistics, and procedural approaches such as user-defined functions (UDFs) are handled efficiently
- Is not easily integrated with other languages and applications
- Data manipulation requires good technical (programming) knowledge

### SQL:
- Setup takes more work, and queries need tuning and optimization
- Database configuration adds complexity and can increase execution time
- Reliability and scalability are much better
- Data integrity is stronger thanks to the Atomicity, Consistency, Isolation, and Durability (ACID) properties of transactions, and database systems typically add access control on top
- Math, statistics, and procedural approaches such as user-defined functions (UDFs) are not handled as well
- Can be easily integrated with most programming languages
- Very easy to read and understand, since SQL is a structured, declarative language

#### Benefits of using SQL and Python:
- SQL and Python each come with their own set of advantages. SQL was designed to query and extract data, and one of its main strengths is merging data from multiple tables within a database. However, SQL alone is a poor fit for higher-level data manipulation and analysis such as regression or time series analysis. Python's specialized library, Pandas, facilitates that kind of work. You can therefore use SQL to fetch the data and Python to manipulate the resulting structured data further.
- SQL commands are simpler and narrower in scope than Python commands. More often than not, they are a combination of JOINs, aggregate functions, and subqueries.

- Python's commands are more like the pieces of a Lego set, where each piece has a specific purpose. Its libraries are specialized parts that help you build something in a particular niche. For example, Pandas is used for data analysis, Scikit-learn for machine learning, PyPDF2 for PDF manipulation, SciPy for numerical routines, and NumPy for mathematical operations and scientific computing.

- The relational database management systems used in many corporate applications require prior knowledge of SQL, which provides a structured route to the desired information. Python, in contrast, offers more readability and portability, and with the right tools and libraries it can be used to develop just about anything.
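
The split described above (SQL fetches and joins, Pandas analyses) can be sketched as follows. This is a minimal illustration, not a recommended architecture: the in-memory SQLite database, the `customers` and `orders` tables, and their contents are all made up for the example.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database with two small example tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 30.0), (3, 2, 5.0);
""")

# SQL handles the join and the extraction.
query = """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
"""
orders = pd.read_sql_query(query, conn)
conn.close()

# Pandas handles the follow-up statistics on the fetched rows.
summary = orders.groupby("name")["amount"].agg(["mean", "sum"])
print(summary)
```

The join could also be done in Pandas with `pd.merge`, but pushing it into the query keeps the amount of data pulled into memory smaller.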


# How to sort DataFrames by index

In [6]:
import pandas as pd
df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150],
                  columns=['A'])
df.sort_index()  # returns a new DataFrame sorted by index label (ascending by default)

| (index) | A |
|---|---|
| 1 | 4 |
| 29 | 2 |
| 100 | 1 |
| 150 | 5 |
| 234 | 3 |
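
As a small extension of the cell above, the sketch below sorts the same frame by index in descending order. It also shows that `sort_index` returns a new DataFrame and leaves the original untouched unless `inplace=True` is passed. The variable name `by_index_desc` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150],
                  columns=["A"])

# Largest index label first; df itself keeps its original order.
by_index_desc = df.sort_index(ascending=False)

print(by_index_desc.index.tolist())
print(df.index.tolist())
```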


# How to sort a DataFrame by a specific column in pandas
- Call `pandas.DataFrame.sort_values(by, ascending=True)`, passing a column name or a list of column names as `by` and either `True` or `False` as `ascending`, to sort a DataFrame by column values.

In [23]:
A_sorted = df.sort_values(["A"], ascending=True)
A_sorted

| (index) | A |
|---|---|
| 100 | 1 |
| 29 | 2 |
| 234 | 3 |
| 1 | 4 |
| 150 | 5 |
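
When sorting by more than one column, `ascending` can also be a list with one flag per column. The sketch below uses a made-up `scores` frame (the column names `team` and `points` are assumptions for the example) and sorts by team A to Z, then by points from highest to lowest within each team.

```python
import pandas as pd

# Hypothetical example data.
scores = pd.DataFrame({
    "team": ["b", "a", "b", "a"],
    "points": [3, 7, 9, 1],
})

# One ascending flag per sort column: team ascending, points descending.
ranked = scores.sort_values(["team", "points"], ascending=[True, False])
print(ranked)
```

The original row labels are carried along with each row. Pass `ignore_index=True` if a fresh 0..n-1 index is preferred.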


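# How to combine more than two DataFrames in one step in pandas

- To stack the frames row-wise and renumber the index: `df = pd.concat([df1, df2, df3], ignore_index=True)`
- To join two frames on their index: `pd.merge(df1, df2, left_index=True, right_index=True, how='outer')`. Note that `pd.merge` takes only two frames at a time.
- To merge more than two frames on a shared key column, compile the list of DataFrames you want to merge and fold `pd.merge` over it with `reduce` (this needs `from functools import reduce`):
  - `data_frames = [df1, df2, df3]`
  - `df_merged = reduce(lambda left, right: pd.merge(left, right, on=['key_col'], how='outer'), data_frames)`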
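
The snippets above can be put together into a runnable sketch. The three small frames and their columns `a`, `b`, and `c` are made up for illustration; only `key_col` comes from the original snippet.

```python
from functools import reduce

import pandas as pd

# Hypothetical example frames that share a key column.
df1 = pd.DataFrame({"key_col": [1, 2], "a": [10, 20]})
df2 = pd.DataFrame({"key_col": [2, 3], "b": [200, 300]})
df3 = pd.DataFrame({"key_col": [1, 3], "c": [1000, 3000]})

# Row-wise stacking: 6 rows, with NaN where a frame lacks a column.
stacked = pd.concat([df1, df2, df3], ignore_index=True)

# Key-based outer merge of all three frames, applied pairwise by reduce.
data_frames = [df1, df2, df3]
df_merged = reduce(
    lambda left, right: pd.merge(left, right, on=["key_col"], how="outer"),
    data_frames,
)

print(stacked.shape)
print(df_merged)
```

`pd.concat` and the `reduce` merge answer different questions: the first appends rows, while the second lines rows up by key, so the right choice depends on whether the frames hold the same kind of records or different attributes of the same records.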