# Column selection with bracket and dot notation

In [1]:
import pandas as pd
import numpy as np

In [2]:
students_score = {
    'Name' : ['Paul', 'Aaron', 'Krista', 'Veronica', 'Paxton', 'Madison', 'Aurora'],
    'Score': [98, 89, 99, 87, 90, 83, 82]
}

In [3]:
df_students = pd.DataFrame(students_score, index=['s1', 's2', 's3', 's4', 's5', 's6', 's7'])

df_students

        Name  Score
s1      Paul     98
s2     Aaron     89
s3    Krista     99
s4  Veronica     87
s5    Paxton     90
s6   Madison     83
s7    Aurora     82


In [4]:
df_students['Name']

s1        Paul
s2       Aaron
s3      Krista
s4    Veronica
s5      Paxton
s6     Madison
s7      Aurora
Name: Name, dtype: object

In [5]:
df_students.Name

s1        Paul
s2       Aaron
s3      Krista
s4    Veronica
s5      Paxton
s6     Madison
s7      Aurora
Name: Name, dtype: object

In [6]:
df_students['Score']

s1    98
s2    89
s3    99
s4    87
s5    90
s6    83
s7    82
Name: Score, dtype: int64

In [7]:
df_students.Score

s1    98
s2    89
s3    99
s4    87
s5    90
s6    83
s7    82
Name: Score, dtype: int64

Always use bracket notation when a column name contains spaces, for example:

`df_students['Student Name']`

Bracket notation is also required to select multiple columns at once, by passing a list of column names:

`df_students[['Name', 'Score']]`

Dot notation raises an error in both cases.

In [18]:
students_score_2 = {
    'Student Name' : ['Pamela', 'Alessandro', 'Kryx', 'Vernon', 'Penny', 'Mifune', 'Autumn'],
    'Score': [98, 89, 99, 87, 90, 83, 82]
}

In [19]:
df_students_2 = pd.DataFrame(students_score_2)

In [20]:
df_students_2.student name

SyntaxError: invalid syntax (2459445292.py, line 1)

Python cannot parse a space inside an attribute name, so dot notation fails with a `SyntaxError`.

Bracket notation has no such problem:

In [21]:
df_students_2['Student Name']

0        Pamela
1    Alessandro
2          Kryx
3        Vernon
4         Penny
5        Mifune
6        Autumn
Name: Student Name, dtype: object
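Selecting several columns at once, mentioned earlier, can be sketched as follows. This is a minimal illustration that rebuilds a shortened version of the students data (three rows only, an assumption made to keep it brief). It shows that a single label returns a Series, while a list of labels returns a DataFrame.

```python
import pandas as pd

# Shortened copy of the students data, for illustration only.
students_score = {
    'Name': ['Paul', 'Aaron', 'Krista'],
    'Score': [98, 89, 99],
}
df = pd.DataFrame(students_score, index=['s1', 's2', 's3'])

single = df['Name']               # one label -> Series
multiple = df[['Name', 'Score']]  # list of labels -> DataFrame

print(type(single).__name__)      # Series
print(type(multiple).__name__)    # DataFrame
```

Note the double brackets: the outer pair is the selection operator, the inner pair is an ordinary Python list of column names.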

Now, note that dot notation is also how we access a DataFrame's own attributes and methods, such as `columns` and `head`:

In [8]:
df_students.columns

Index(['Name', 'Score'], dtype='object')

In [9]:
df_students.head

<bound method NDFrame.head of         Name  Score
s1      Paul     98
s2     Aaron     89
s3    Krista     99
s4  Veronica     87
s5    Paxton     90
s6   Madison     83
s7    Aurora     82>

Let's see another example. Suppose we create a DataFrame in which one or more column names
are the same as existing DataFrame attributes or methods:

In [10]:
shapes = {'shape': ['Round', 'Triangle', 'Rectangle', 'Square'],
          'head' : [1, 2, 3, 4]}

df_shapes = pd.DataFrame(shapes)

Now, if we access the 'shape' column with dot notation, we get:

In [11]:
df_shapes.shape

(4, 2)

Instead of showing the data in the 'shape' column, the code above returns the shape
of the dataset (4 rows, 2 columns), because every DataFrame, including `df_shapes`, already has a built-in attribute called `.shape`.

A similar thing happens if we access the 'head' column with dot notation:

In [15]:
df_shapes.head

<bound method NDFrame.head of        shape  head
0      Round     1
1   Triangle     2
2  Rectangle     3
3     Square     4>

It returns the DataFrame's `head` method itself, not the column.
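To confirm that dot notation here resolves to the DataFrame's own members rather than to the columns, a quick check along these lines may help. It is only a sketch that rebuilds `df_shapes` so the snippet runs on its own.

```python
import pandas as pd

df_shapes = pd.DataFrame({
    'shape': ['Round', 'Triangle', 'Rectangle', 'Square'],
    'head': [1, 2, 3, 4],
})

# Dot notation finds the built-in attribute and method first.
print(type(df_shapes.shape).__name__)  # tuple
print(callable(df_shapes.head))        # True

# Calling the method returns the first rows of the whole frame.
first_two = df_shapes.head(2)
print(first_two.shape)                 # (2, 2)
```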

Now compare this with bracket ( [] ) notation:

In [16]:
df_shapes['head']

0    1
1    2
2    3
3    4
Name: head, dtype: int64

In [17]:
df_shapes['shape']

0        Round
1     Triangle
2    Rectangle
3       Square
Name: shape, dtype: object

Et voilà! Bracket notation returns the actual values of each column.
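The two pitfalls above (names that are not valid Python identifiers, and names shadowed by DataFrame members) can be checked programmatically. The helper below, `dot_unsafe_columns`, is a hypothetical name introduced here for illustration and is not part of pandas. It is a rough sketch, not an exhaustive test.

```python
import keyword

import pandas as pd


def dot_unsafe_columns(df):
    """Return the column names that dot notation cannot reach reliably."""
    unsafe = []
    for col in df.columns:
        name = str(col)
        # Spaces, leading digits, or reserved words cannot follow a dot.
        not_identifier = not name.isidentifier() or keyword.iskeyword(name)
        # An existing DataFrame attribute or method takes precedence.
        shadowed = hasattr(pd.DataFrame, name)
        if not_identifier or shadowed:
            unsafe.append(col)
    return unsafe


# Example frame combining the column names used in this section.
df_example = pd.DataFrame({
    'shape': ['Round', 'Square'],
    'head': [1, 2],
    'Student Name': ['Pamela', 'Kryx'],
    'Score': [98, 99],
})

print(dot_unsafe_columns(df_example))  # ['shape', 'head', 'Student Name']
```

Only 'Score' is safe to access with dot notation here. For the other three columns, bracket notation is the reliable choice.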