## Index Management in Pandas

Index management in Pandas covers accessing, setting, and manipulating the row labels of a DataFrame or Series. The index drives label-based retrieval and determines how rows are aligned when two objects are combined.
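As a minimal sketch of label-based retrieval and alignment, the snippet below uses two small made-up Series. The values and the `bonuses` name are illustrative assumptions and do not come from `data_jobs.csv`.

```python
import pandas as pd

# Hypothetical toy data, not taken from data_jobs.csv
salaries = pd.Series([90000, 125000], index=['Data Analyst', 'Data Engineer'])
bonuses = pd.Series([5000, 3000], index=['Data Engineer', 'Data Analyst'])

# Retrieval: look a value up by its index label
analyst_salary = salaries.loc['Data Analyst']

# Alignment: arithmetic pairs values by label, not by position
total = salaries + bonuses

print(analyst_salary)
print(total)
```

Even though the two Series list their labels in a different order, each salary is added to the bonus that carries the same label.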

In [3]:
import pandas as pd
df = pd.read_csv('D:\\python Programming\\DataSets\\data_jobs.csv')

In [4]:
df['job_posted_date'] = pd.to_datetime(df['job_posted_date'])

In [None]:
# Give the default integer index a name, then read it back
df.index.name = 'job_id'
df.index.name

'job_id'

In [26]:
median_pivot = df.pivot_table(values='salary_year_avg', index='job_title_short', aggfunc='median')
median_pivot

| job_title_short | salary_year_avg |
|---|---|
| Business Analyst | 85000.0 |
| Cloud Engineer | 90000.0 |
| Data Analyst | 90000.0 |
| Data Engineer | 125000.0 |
| Data Scientist | 127500.0 |
| Machine Learning Engineer | 106415.0 |
| Senior Data Analyst | 111175.0 |
| Senior Data Engineer | 147500.0 |
| Senior Data Scientist | 155500.0 |
| Software Engineer | 99150.0 |


In [27]:
median_pivot.index.name

'job_title_short'

**reset_index(): this method restores the default integer index (0, 1, 2, ...) and moves the current index into a regular column.**

In [None]:

# inplace=True modifies median_pivot directly instead of returning a new DataFrame
median_pivot.reset_index(inplace=True)
median_pivot


|  | job_title_short | salary_year_avg |
|---|---|---|
| 0 | Business Analyst | 85000.0 |
| 1 | Cloud Engineer | 90000.0 |
| 2 | Data Analyst | 90000.0 |
| 3 | Data Engineer | 125000.0 |
| 4 | Data Scientist | 127500.0 |
| 5 | Machine Learning Engineer | 106415.0 |
| 6 | Senior Data Analyst | 111175.0 |
| 7 | Senior Data Engineer | 147500.0 |
| 8 | Senior Data Scientist | 155500.0 |
| 9 | Software Engineer | 99150.0 |


In [33]:
# Move the named 'job_id' index into a regular column and restore the default index
df.reset_index(inplace=True)
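By default `reset_index()` keeps the old index as a column; passing `drop=True` discards it instead. The sketch below shows both on a hypothetical two-row frame (`toy` is an illustrative stand-in for `median_pivot`, not the real data).

```python
import pandas as pd

# Hypothetical toy frame with a named index, standing in for median_pivot
toy = pd.DataFrame(
    {'salary_year_avg': [85000.0, 90000.0]},
    index=pd.Index(['Business Analyst', 'Cloud Engineer'], name='job_title_short'),
)

# Default: the old index becomes a regular column named after the index
kept = toy.reset_index()

# drop=True: the old index is thrown away entirely
dropped = toy.reset_index(drop=True)

print(list(kept.columns))
print(list(dropped.columns))
```

Without `inplace=True`, both calls return new DataFrames and leave `toy` untouched.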

**set_index(): this method sets an existing column (ideally one with unique values) as the row index.**

In [34]:
df_united_state = df[df['job_country'] == 'United States']
df_united_state.sample(5)

|  | job_id | job_title_short | job_title | job_location | job_via | job_schedule_type | job_work_from_home | search_location | job_posted_date | job_no_degree_mention | job_health_insurance | job_country | salary_rate | salary_year_avg | salary_hour_avg | company_name | job_skills | job_type_skills |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 732506 | 732506 | Data Analyst | Data Analyst | Austin, TX | via LinkedIn | Full-time | False | Texas, United States | 2023-11-09 21:01:11 | False | False | United States |  |  |  | TextNow | ['sql', 'snowflake', 'tableau', 'flow'] | {'analyst_tools': ['tableau'], 'cloud': ['snow... |
| 324324 | 324324 | Data Analyst | RA/QA Data Analytics | Sunnyvale, CA | via Dice | Contractor | False | California, United States | 2023-07-06 17:01:07 | False | False | United States | hour |  | 55.0 | Mumba Technologies | ['sql', 'tableau', 'power bi', 'sap'] | {'analyst_tools': ['tableau', 'power bi', 'sap... |
| 449117 | 449117 | Senior Data Engineer | Senior Data Engineer | Louisiana | via LinkedIn | Contractor | False | New York, United States | 2023-05-26 14:06:18 | False | False | United States |  |  |  | Vuesol Technologies Inc | ['sql', 'scala', 'sql server', 'azure', 'datab... | {'cloud': ['azure', 'databricks'], 'databases'... |
| 477205 | 477205 | Senior Data Scientist | URBN Senior Data Scientist | Philadelphia, PA | via Big Country Jobs | Full-time | False | New York, United States | 2023-01-07 14:02:05 | False | False | United States |  |  |  | URBN (Urban Outfitters, Anthropologie Group, F... | ['sql', 'python', 'jupyter', 'tensorflow', 'py... | {'libraries': ['jupyter', 'tensorflow', 'pytor... |
| 286842 | 286842 | Data Engineer | Sr. Data Engineer | Longview, TX | via BeBee | Full-time | False | Texas, United States | 2023-11-28 17:07:28 | False | True | United States |  |  |  | Jobs for Humanity | ['scala', 'nosql', 'python', 'sql', 'java', 'm... | {'cloud': ['aws', 'azure', 'redshift', 'snowfl... |


In [36]:
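# set_index() returns a new DataFrame. Without assignment the result is discarded,
# so df_united_state is unchanged and 'job_id' is still a regular column below.
df_united_state.set_index('job_id')
df_united_state.head(5)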

|  | job_id | job_title_short | job_title | job_location | job_via | job_schedule_type | job_work_from_home | search_location | job_posted_date | job_no_degree_mention | job_health_insurance | job_country | salary_rate | salary_year_avg | salary_hour_avg | company_name | job_skills | job_type_skills |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Senior Data Engineer | Senior Clinical Data Engineer / Principal Clin... | Watertown, CT | via Work Nearby | Full-time | False | Texas, United States | 2023-06-16 13:44:15 | False | False | United States |  |  |  | Boehringer Ingelheim |  |  |
| 3 | 3 | Data Engineer | LEAD ENGINEER - PRINCIPAL ANALYST - PRINCIPAL ... | San Antonio, TX | via Diversity.com | Full-time | False | Texas, United States | 2023-07-04 13:01:41 | True | False | United States |  |  |  | Southwest Research Institute | ['python', 'c++', 'java', 'matlab', 'aws', 'te... | {'cloud': ['aws'], 'libraries': ['tensorflow',... |
| 5 | 5 | Data Engineer | GCP Data Engineer | Anywhere | via ZipRecruiter | Contractor and Temp work | True | Georgia | 2023-11-07 14:01:59 | False | False | United States |  |  |  | smart folks inc | ['python', 'sql', 'gcp'] | {'cloud': ['gcp'], 'programming': ['python', '... |
| 6 | 6 | Senior Data Engineer | Senior Data Engineer - GCP Cloud | Dearborn, MI | via LinkedIn | Full-time | False | Florida, United States | 2023-03-27 13:18:18 | False | False | United States |  |  |  | Miracle Software Systems, Inc | ['sql', 'python', 'java', 'sql server', 'gcp',... | {'cloud': ['gcp', 'bigquery'], 'databases': ['... |
| 9 | 9 | Data Scientist | Data Scientist II | Anywhere | via ZipRecruiter | Full-time | True | New York, United States | 2023-04-23 13:02:57 | False | False | United States |  |  |  | Radwell International, LLC | ['sql', 'python', 'r', 'mongodb', 'mongodb', '... | {'analyst_tools': ['excel'], 'cloud': ['azure'... |
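The output above still lists `job_id` as an ordinary column because `set_index()` returns a new DataFrame rather than modifying the one it is called on. A minimal sketch of the difference, using a hypothetical three-row frame (`toy` and `indexed` are illustrative names, not part of the notebook):

```python
import pandas as pd

# Hypothetical toy frame, loosely modelled on the columns shown above
toy = pd.DataFrame({
    'job_id': [0, 3, 5],
    'job_title_short': ['Senior Data Engineer', 'Data Engineer', 'Data Engineer'],
})

# Result is discarded: toy still has 'job_id' as a column
toy.set_index('job_id')
columns_after_discard = list(toy.columns)

# Assigning the result keeps the new index
indexed = toy.set_index('job_id')
title_for_job_3 = indexed.loc[3, 'job_title_short']

print(columns_after_discard)
print(indexed.index.name, title_for_job_3)
```

Assigning the result is usually the safer choice when the frame is a filtered subset of another DataFrame, as `df_united_state` is here.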


In [38]:
median_pivot.set_index('job_title_short', inplace=True)

In [39]:
median_pivot

| job_title_short | salary_year_avg |
|---|---|
| Business Analyst | 85000.0 |
| Cloud Engineer | 90000.0 |
| Data Analyst | 90000.0 |
| Data Engineer | 125000.0 |
| Data Scientist | 127500.0 |
| Machine Learning Engineer | 106415.0 |
| Senior Data Analyst | 111175.0 |
| Senior Data Engineer | 147500.0 |
| Senior Data Scientist | 155500.0 |
| Software Engineer | 99150.0 |


**sort_index(): this method sorts rows by their index labels. It is rarely called on a raw dataset, but it is useful for ordering the result of a grouping or pivot table. To sort by a column's values instead, as in the cell below, use sort_values().**

In [42]:
median_pivot.sort_values(by='salary_year_avg', ascending=False)

| job_title_short | salary_year_avg |
|---|---|
| Senior Data Scientist | 155500.0 |
| Senior Data Engineer | 147500.0 |
| Data Scientist | 127500.0 |
| Data Engineer | 125000.0 |
| Senior Data Analyst | 111175.0 |
| Machine Learning Engineer | 106415.0 |
| Software Engineer | 99150.0 |
| Cloud Engineer | 90000.0 |
| Data Analyst | 90000.0 |
| Business Analyst | 85000.0 |
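For comparison, here is a minimal sketch of `sort_index()` next to `sort_values()` on a hypothetical three-row frame (`toy` is an illustrative stand-in for `median_pivot`, with its rows deliberately out of order).

```python
import pandas as pd

# Hypothetical toy frame with an unsorted, named index
toy = pd.DataFrame(
    {'salary_year_avg': [127500.0, 85000.0, 125000.0]},
    index=pd.Index(
        ['Data Scientist', 'Business Analyst', 'Data Engineer'],
        name='job_title_short',
    ),
)

# sort_index(): order rows alphabetically by index label
by_label = toy.sort_index()

# sort_values(): order rows by a column, highest salary first
by_value = toy.sort_values(by='salary_year_avg', ascending=False)

print(list(by_label.index))
print(list(by_value.index))
```

Both methods return a sorted copy by default, so `toy` keeps its original row order.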
