# **Guided Lab 343.3.6 - Selecting Columns in Pandas DataFrames**

---



## **Learning Objective:**
This lab focuses on accessing and selecting specific columns from Pandas DataFrames, a fundamental skill in data manipulation and analysis.

By the end of this lab, learners will be able to select any specified column or columns from a Pandas DataFrame.

**Introduction:**

Pandas DataFrames provide flexible ways to select columns, including:

1. **Column Name:** Use square brackets `[]` with the column name to select a single column, or with a list of column names to select multiple columns.
2. **Column Index Number:** Look up a column by its zero-based position in `df.columns`, then select it with square brackets.

**Lab Activities:**

1. **Import Dataset:** Begin by importing the `cars.json` dataset into a Pandas DataFrame named `df_cars`.
2. **Select Single Column:** Extract the 'Car' column to demonstrate single column selection.
3. **Select Multiple Columns:** Select 'Car', 'Model', and 'quantity' columns to illustrate multiple column selection.
4. **Select by Column Index:** Practice selecting columns using their index number.
5. **Continue Learning:** This lab serves as an introduction. Further exploration of column selection techniques will be covered later in the module.



## **Introduction:**
**Selecting columns by column name**

- To select a single column, use square brackets `[]` with the name of the column of interest. The result is a Series.

- To select multiple columns, put a list of column names inside the selection brackets `[]`. The result is a DataFrame.

**Syntax:**

```
# select one column as a Series
s = df['colName']

# select one column as a DataFrame
df_one = df[['colName']]

# select two or more columns
df_two = df[['colOne', 'colTwo']]

# select a column by its index number (zero-based)
s = df[df.columns[0]]

# select several columns by their index numbers
df_some = df[df.columns[[0, 3, 4]]]
```
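
The syntax forms above can be tried on a tiny frame. The sketch below is illustrative only: the three-row `df` is a hypothetical stand-in built from values shown later in this lab, not the real dataset. It prints the type each form returns, and shows that `df.columns[[...]]` on its own yields column labels rather than data.

```python
import pandas as pd

# Hypothetical three-row frame (an assumption for illustration, not cars.json itself).
df = pd.DataFrame({
    "Car": ["Chevrolet Vega", "Chevrolet Vega (sw)", "Chevrolet Vega 2300"],
    "MPG": [25.0, 22.0, 28.0],
    "Cylinders": [4, 4, 4],
})

single = df["Car"]                    # one label -> Series
single_frame = df[["Car"]]            # list with one label -> DataFrame
labels = df.columns[[0, 2]]           # positions 0 and 2 -> Index of labels only
several = df[labels]                  # labels passed back to df[...] -> DataFrame

print(type(single).__name__)          # Series
print(type(single_frame).__name__)    # DataFrame
print(type(labels).__name__)          # Index
print(list(several.columns))          # ['Car', 'Cylinders']
```

The practical point is that `df.columns[[0, 2]]` must be passed back into `df[...]` to obtain the column data.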





---

## **Dataset:**

The lab utilizes the [`cars.json`](https://drive.google.com/file/d/1CXAK8gbuLtc2NNOXVUgmja8fDg0TrNZm/view) dataset, providing a practical context for applying column selection methods.

Let's import the cars dataset into a Pandas DataFrame.

In [1]:
import pandas as pd

In [6]:

df_cars = pd.read_json('./Data/cars.json')
df_cars


| | Car | MPG | Cylinders | Displacement | Horsepower | Weight | Acceleration | Model | Origin | quantity | city |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Chevrolet Vega | 25.0 | 4 | 140.0 | 75 | 2542 | 17.0 | 74 | US | 177 | NJ |
| 1 | Chevrolet Vega (sw) | 22.0 | 4 | 140.0 | 72 | 2408 | 19.0 | 71 | US | 91 | DALLAS |
| 2 | Chevrolet Vega 2300 | 28.0 | 4 | 140.0 | 90 | 2264 | 15.5 | 71 | US | 74 | TEXAS |
| 3 | Chevrolet Woody | 24.5 | 4 | 98.0 | 60 | 2164 | 22.1 | 76 | US | 241 | OH |
| 4 | Chevrolete Chevelle Malibu | 16.0 | 6 | 250.0 | 105 | 3897 | 18.5 | 75 | US | 206 | NewYork |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 156 | Mercury Capri v6 | 21.0 | 6 | 155.0 | 107 | 2472 | 14.0 | 73 | US | 158 | NewYork |
| 157 | Mercury Cougar Brougham | 15.0 | 8 | 302.0 | 130 | 4295 | 14.9 | 77 | US | 27 | NJ |
| 158 | Mercury Grand Marquis | 16.5 | 8 | 351.0 | 138 | 3955 | 13.2 | 79 | US | 332 | DALLAS |
| 159 | Mercury Lynx l | 36.0 | 4 | 98.0 | 70 | 2125 | 17.3 | 82 | US | 425 | TEXAS |
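
If the file is not at hand, `pd.read_json` can be tried on an in-memory string. This is a minimal sketch under stated assumptions: the two records are copied from the output above, and the records-style layout is a guess, since the real `cars.json` may be organized differently. The names `records` and `df_sample` are hypothetical.

```python
import json
from io import StringIO

import pandas as pd

# Assumption: a list-of-records layout; the actual cars.json may use another orientation.
records = [
    {"Car": "Chevrolet Vega", "Cylinders": 4, "Origin": "US", "quantity": 177},
    {"Car": "Chevrolet Vega (sw)", "Cylinders": 4, "Origin": "US", "quantity": 91},
]

# read_json accepts a file path or a file-like object; StringIO stands in for the file here.
df_sample = pd.read_json(StringIO(json.dumps(records)))

print(df_sample.shape)           # (2, 4)
print(list(df_sample.columns))   # ['Car', 'Cylinders', 'Origin', 'quantity']
```

Checking `shape` and `columns` right after loading is a quick way to confirm the column names before selecting any of them.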


## **Example - Select One Column:**

Suppose we are interested in the names of the cars.

In [7]:
df_cars['Car']

0                  Chevrolet Vega
1             Chevrolet Vega (sw)
2             Chevrolet Vega 2300
3                 Chevrolet Woody
4      Chevrolete Chevelle Malibu
                  ...            
156              Mercury Capri v6
157       Mercury Cougar Brougham
158         Mercury Grand Marquis
159                Mercury Lynx l
160               Mercury Marquis
Name: Car, Length: 161, dtype: object

In [9]:
df_cars['Car'].head(10)

0                Chevrolet Vega
1           Chevrolet Vega (sw)
2           Chevrolet Vega 2300
3               Chevrolet Woody
4    Chevrolete Chevelle Malibu
5                     Chevy C20
6                    Chevy S-10
7              Chrysler Cordoba
8    Chrysler Lebaron Medallion
9        Chrysler Lebaron Salon
Name: Car, dtype: object
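
The `Name: Car` line in the output comes from the Series itself. As a small sketch, the hypothetical `df_demo` below reuses the first five car names printed above to show where the name and length come from and how `head(n)` limits the rows.

```python
import pandas as pd

# Hypothetical stand-in for df_cars, using the first five rows shown above.
df_demo = pd.DataFrame({
    "Car": ["Chevrolet Vega", "Chevrolet Vega (sw)", "Chevrolet Vega 2300",
            "Chevrolet Woody", "Chevrolete Chevelle Malibu"],
    "Cylinders": [4, 4, 4, 4, 6],
})

names = df_demo["Car"]

print(names.name)               # Car  (the Series keeps the column label as its name)
print(len(names))               # 5    (one entry per row of the frame)
print(names.head(2).tolist())   # ['Chevrolet Vega', 'Chevrolet Vega (sw)']
```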



---



## **Example: Select Multiple Columns**

Suppose we are interested in each car's name, model, and quantity. The second `print` call selects `Horsepower` and `Weight` in the same way.

In [17]:
print(df_cars[['Car','Model', 'quantity']].head(4))
print()
print(df_cars[['Horsepower','Weight']].head(4))


                   Car  Model  quantity
0       Chevrolet Vega     74       177
1  Chevrolet Vega (sw)     71        91
2  Chevrolet Vega 2300     71        74
3      Chevrolet Woody     76       241

   Horsepower  Weight
0          75    2542
1          72    2408
2          90    2264
3          60    2164
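
Two details of list-based selection are easy to miss: the result follows the order of the list, and labels are case-sensitive. The sketch below illustrates both on a hypothetical two-row `df_demo` built from the values above; the misspelled label `"Quantity"` is deliberate.

```python
import pandas as pd

# Hypothetical two-row stand-in for df_cars.
df_demo = pd.DataFrame({
    "Car": ["Chevrolet Vega", "Chevrolet Vega (sw)"],
    "Model": [74, 71],
    "quantity": [177, 91],
})

# Columns come back in the order given in the list, not the order in the frame.
reordered = df_demo[["quantity", "Car"]]
print(list(reordered.columns))   # ['quantity', 'Car']

# "Quantity" (capital Q) is not a column here, so pandas raises KeyError.
missing_label_error = None
try:
    df_demo[["Car", "Quantity"]]
except KeyError as err:
    missing_label_error = err
    print("KeyError:", err)
```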


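## **Example: Select single column by column index number**

Column positions are zero-based, so `df_cars.columns[0]` is the first column and `df_cars.columns[2]` is the third.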

In [14]:
print("Column 1:")
print(df_cars[df_cars.columns[0]].head(5), '\n')

print("Column 3:")
print(df_cars[df_cars.columns[2]].head(5), '\n')


Column 1:
0                Chevrolet Vega
1           Chevrolet Vega (sw)
2           Chevrolet Vega 2300
3               Chevrolet Woody
4    Chevrolete Chevelle Malibu
Name: Car, dtype: object 

Column 3:
0    4
1    4
2    4
3    4
4    6
Name: Cylinders, dtype: int64 
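
Selecting by index number is a two-step lookup: `df.columns[i]` returns the label at position `i`, and that label is then used like any column name. A minimal sketch on a hypothetical `df_demo`, with `get_loc` shown for the reverse direction:

```python
import pandas as pd

# Hypothetical two-row stand-in for df_cars.
df_demo = pd.DataFrame({
    "Car": ["Chevrolet Vega", "Chevrolet Vega (sw)"],
    "MPG": [25.0, 22.0],
    "Cylinders": [4, 4],
})

# df.columns is an Index of labels; positions start at 0.
first_label = df_demo.columns[0]
third_label = df_demo.columns[2]
print(first_label, third_label)        # Car Cylinders

# Selecting by the looked-up label gives the same Series as selecting by name.
same = df_demo[df_demo.columns[2]].equals(df_demo["Cylinders"])
print(same)                            # True

# get_loc goes the other way: label -> position.
mpg_position = df_demo.columns.get_loc("MPG")
print(mpg_position)                    # 1
```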



## **Example: Select multiple columns by column index number**

In [23]:
print(df_cars[df_cars.columns[[0,1,9]]].head(2))
print()
print(df_cars[df_cars.columns[[7, 3, 5]]].head(2))

                   Car   MPG  quantity
0       Chevrolet Vega  25.0       177
1  Chevrolet Vega (sw)  22.0        91

   Model  Displacement  Weight
0     74         140.0    2542
1     71         140.0    2408
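
For reference, pandas also offers position-based indexing through `iloc`, which this lab does not use. The sketch below, on a hypothetical `df_demo`, suggests that `df.iloc[:, positions]` returns the same columns as the `df[df.columns[positions]]` pattern shown above; treat it as an optional alternative rather than the lab's method.

```python
import pandas as pd

# Hypothetical two-row stand-in for df_cars.
df_demo = pd.DataFrame({
    "Car": ["Chevrolet Vega", "Chevrolet Vega (sw)"],
    "MPG": [25.0, 22.0],
    "Cylinders": [4, 4],
    "quantity": [177, 91],
})

positions = [0, 1, 3]

via_labels = df_demo[df_demo.columns[positions]]   # pattern used in this lab
via_iloc = df_demo.iloc[:, positions]              # all rows, columns at these positions

print(list(via_iloc.columns))        # ['Car', 'MPG', 'quantity']
print(via_labels.equals(via_iloc))   # True
```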




---



**To be continued: We will continue this lab later in this module.**

## **Submission Instructions**
- Submit your completed lab using the Start Assignment button on the assignment page in Canvas.
- Your submission can take either of the following forms:
  - If you are using a notebook, write and submit all tasks in a single notebook file, for example **your_name_labname.ipynb**.
  - If you are using a Python script, write and submit all tasks in a single script file, for example **your_name_labname.py**.
- Add appropriate comments and any additional instructions if required.
