## Introduction to Databases

### Using SQLAlchemy 

Based on [this](https://medium.com/hacking-datascience/sqlalchemy-python-tutorial-abcc2ec77b57), [this](https://medium.com/dataexplorations/sqlalchemy-orm-a-more-pythonic-way-of-interacting-with-your-database-935b57fd2d4d) and [this](https://auth0.com/blog/sqlalchemy-orm-tutorial-for-python-developers/).

SQLAlchemy provides a nice “Pythonic” way of interacting with databases. So rather than dealing with the differences between specific dialects of traditional SQL such as MySQL or PostgreSQL or Oracle, you can leverage the Pythonic framework of SQLAlchemy to streamline your workflow and more efficiently query your data.


In [16]:
#!pip install mysqlclient
#!pip install python-dotenv
#!pip install sqlalchemy

In [17]:
import os
import getpass
import pandas as pd

import sqlalchemy

### Connecting to a database

To start interacting with the database, we first need to establish a connection.  

Some examples of connecting to various databases can be found [here](http://docs.sqlalchemy.org/en/latest/core/engines.html#postgresql)
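
As a minimal sketch of what establishing a connection looks like, the block below opens an in-memory SQLite database and runs a trivial query. The in-memory URL is an assumption made so the example does not depend on any file; the general URL pattern is dialect+driver://user:password@host:port/database.

```python
from sqlalchemy import create_engine, text

# SQLite needs only a path; ":memory:" keeps the whole database in RAM.
engine = create_engine("sqlite:///:memory:")

with engine.connect() as connection:
    # text() wraps a raw SQL string so that it can be executed
    answer = connection.execute(text("SELECT 1")).scalar()

print(answer)
```

The engine itself is lazy: no connection is opened until connect() (or a query) is called.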

### Viewing Table Details

SQLAlchemy can be used to automatically load tables from a database using something called reflection. Reflection is the process of reading the database and building the metadata based on that information.
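
A small self-contained sketch of reflection, assuming an in-memory SQLite database with a stand-in albums table (the schema here is made up for illustration). It passes only autoload_with, since the separate autoload=True flag is deprecated in SQLAlchemy 1.4 and removed in 2.0.

```python
from sqlalchemy import create_engine, MetaData, Table, text

engine = create_engine("sqlite:///:memory:")

# Create a small stand-in table with plain SQL (hypothetical schema)
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE albums ("
        "AlbumId INTEGER PRIMARY KEY, "
        "Title NVARCHAR(160) NOT NULL, "
        "ArtistId INTEGER NOT NULL)"
    ))

# Reflection: the column definitions are read from the database itself
metadata = MetaData()
albums = Table("albums", metadata, autoload_with=engine)

column_names = albums.columns.keys()
print(column_names)
```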

In [9]:
engine = sqlalchemy.create_engine('sqlite:///../SampleDBs/chinook.sqlite')
connection = engine.connect()
metadata = sqlalchemy.MetaData()
albums = sqlalchemy.Table('albums', metadata, autoload=True, autoload_with=engine)

In [10]:
# Print the column names
print(albums.columns.keys())

['AlbumId', 'Title', 'ArtistId']


In [11]:
# Print full table metadata
print(repr(metadata.tables['albums']))

Table('albums', MetaData(bind=None), Column('AlbumId', INTEGER(), table=<albums>, primary_key=True, nullable=False), Column('Title', NVARCHAR(length=160), table=<albums>, nullable=False), Column('ArtistId', INTEGER(), ForeignKey('artists.ArtistId'), table=<albums>, nullable=False), schema=None)


### Querying

ResultProxy: The object returned by the .execute() method. It can be used in a variety of ways to get the data returned by the query.  

ResultSet: The actual data asked for in the query when using a fetch method such as .fetchall() on a ResultProxy.  
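
To make the distinction concrete, here is a minimal sketch (assuming SQLAlchemy 1.4 or newer and a hypothetical demo_albums table) in which the object returned by execute() is consumed in two steps. In recent versions that object is called CursorResult rather than ResultProxy, but the fetch methods are the same.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, String, insert, select)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
demo = Table("demo_albums", metadata,          # hypothetical table
             Column("AlbumId", Integer, primary_key=True),
             Column("Title", String(160)))
metadata.create_all(engine)

with engine.begin() as conn:                   # commits on success
    conn.execute(insert(demo), [
        {"AlbumId": 1, "Title": "For Those About To Rock We Salute You"},
        {"AlbumId": 2, "Title": "Balls to the Wall"},
        {"AlbumId": 3, "Title": "Restless and Wild"},
    ])

with engine.connect() as conn:
    result = conn.execute(select(demo).order_by(demo.c.AlbumId))
    first_two = result.fetchmany(2)            # consumes two rows
    remaining = result.fetchall()              # whatever is left

print(first_two, remaining)
```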

In [13]:
#Equivalent to 'SELECT * FROM albums'
query = sqlalchemy.select([albums])

In [14]:
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
#partial_results = ResultProxy.fetchmany(50)
ResultSet[:3]

[(1, 'For Those About To Rock We Salute You', 1),
 (2, 'Balls to the Wall', 2),
 (3, 'Restless and Wild', 2)]

Convert to dataframe

In [19]:
df = pd.DataFrame(ResultSet)
df.columns = ResultSet[0].keys()
df.head()

| | AlbumId | Title | ArtistId |
|---|---|---|---|
| 0 | 1 | For Those About To Rock We Salute You | 1 |
| 1 | 2 | Balls to the Wall | 2 |
| 2 | 3 | Restless and Wild | 2 |
| 3 | 4 | Let There Be Rock | 1 |
| 4 | 5 | Big Ones | 3 |
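
The conversion above takes the column names from the first row, which fails on an empty result and on SQLAlchemy 2.0, where rows no longer expose keys(). A hedged alternative is to read the names from the result object itself; the query and values below are made up for illustration.

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1 AS AlbumId, 'Big Ones' AS Title"))
    column_names = list(result.keys())          # names come from the result
    rows = [tuple(row) for row in result.fetchall()]

# Passing the names up front also works when rows is empty
df = pd.DataFrame(rows, columns=column_names)
print(df)
```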


### Filtering data

Let's see some examples of raw SQLite queries and the equivalent queries using SQLAlchemy.

#### where

In [33]:
## SQL :SELECT * FROM artists WHERE Name = "Caetano Veloso" :

artists = sqlalchemy.Table('artists', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([artists]).where(artists.columns.Name == 'Caetano Veloso')
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet

[(16, 'Caetano Veloso')]

#### in

In [34]:
## SQL : SELECT FirstName, LastName FROM customers WHERE City IN ("Rio de Janeiro", "New York")

customers = sqlalchemy.Table('customers', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([customers.columns.FirstName, 
                           customers.columns.LastName]).where(customers.columns.City.in_(['Rio de Janeiro', 'New York']))
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet

[('Roberto', 'Almeida'), ('Michelle', 'Brooks')]

#### and, or, not

In [39]:
## SQL : SELECT * FROM customers WHERE City = 'Rio de Janeiro' AND NOT FirstName = 'Roberta'
    
customers = sqlalchemy.Table('customers', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([customers]).where(sqlalchemy.and_(customers.columns.City == "Rio de Janeiro",
                                                             customers.columns.FirstName != "Roberta")
                                            )
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet

[(12, 'Roberto', 'Almeida', 'Riotur', 'Praça Pio X, 119', 'Rio de Janeiro', 'RJ', 'Brazil', '20040-020', '+55 (21) 2271-7000', '+55 (21) 2271-7070', 'roberto.almeida@riotur.gov.br', 3)]
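
The three filters above (where, in_, and and_ with a negated comparison) can be tried without the Chinook file. This is a sketch assuming SQLAlchemy 1.4 or newer and a hypothetical people table with made-up rows.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, String, insert, select, and_)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
people = Table("people", metadata,             # hypothetical table
               Column("Id", Integer, primary_key=True),
               Column("FirstName", String(40)),
               Column("City", String(40)))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(people), [
        {"Id": 1, "FirstName": "Roberto", "City": "Rio de Janeiro"},
        {"Id": 2, "FirstName": "Roberta", "City": "Rio de Janeiro"},
        {"Id": 3, "FirstName": "Michelle", "City": "New York"},
    ])

with engine.connect() as conn:
    # WHERE City = 'New York'
    equal = conn.execute(
        select(people.c.FirstName).where(people.c.City == "New York")
    ).fetchall()
    # WHERE City IN ('Rio de Janeiro', 'New York')
    in_rows = conn.execute(
        select(people.c.FirstName)
        .where(people.c.City.in_(["Rio de Janeiro", "New York"]))
    ).fetchall()
    # WHERE City = 'Rio de Janeiro' AND FirstName != 'Roberta'
    and_not = conn.execute(
        select(people.c.FirstName).where(
            and_(people.c.City == "Rio de Janeiro",
                 people.c.FirstName != "Roberta"))
    ).fetchall()

print(equal, in_rows, and_not)
```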

#### order by

In [44]:
## SQL : SELECT * FROM customers ORDER BY City DESC, Country

customers = sqlalchemy.Table('customers', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([customers]).order_by(sqlalchemy.desc(customers.columns.City), customers.columns.Country)
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet[:3]

[(33, 'Ellie', 'Sullivan', None, '5112 48 Street', 'Yellowknife', 'NT', 'Canada', 'X1A 1N6', '+1 (867) 920-2233', None, 'ellie.sullivan@shaw.ca', 3),
 (32, 'Aaron', 'Mitchell', None, '696 Osborne Street', 'Winnipeg', 'MB', 'Canada', 'R3L 2B9', '+1 (204) 452-6452', None, 'aaronmitchell@yahoo.ca', 4),
 (49, 'Stanisław', 'Wójcik', None, 'Ordynacka 10', 'Warsaw', None, 'Poland', '00-358', '+48 22 828 37 39', None, 'stanisław.wójcik@wp.pl', 4)]

#### functions
Other functions include avg, count, min, and max.  

In [48]:
##SQL : SELECT SUM(Total) FROM invoices 
    
invoices = sqlalchemy.Table('invoices', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([sqlalchemy.func.sum(invoices.columns.Total)])
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet

[(Decimal('2328.60'),)]

#### group by

In [51]:
##SQL : SELECT SUM(Total) as Total, InvoiceDate FROM invoices GROUP BY InvoiceDate

invoices = sqlalchemy.Table('invoices', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([sqlalchemy.func.sum(invoices.columns.Total).label('Total'), 
                           invoices.columns.InvoiceDate]).group_by(invoices.columns.InvoiceDate)

ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet[:5]

[(Decimal('1.98'), datetime.datetime(2009, 1, 1, 0, 0)),
 (Decimal('3.96'), datetime.datetime(2009, 1, 2, 0, 0)),
 (Decimal('5.94'), datetime.datetime(2009, 1, 3, 0, 0)),
 (Decimal('8.91'), datetime.datetime(2009, 1, 6, 0, 0)),
 (Decimal('13.86'), datetime.datetime(2009, 1, 11, 0, 0))]

#### distinct

In [52]:
## SQL : SELECT DISTINCT CustomerId FROM invoices

invoices = sqlalchemy.Table('invoices', metadata, autoload=True, autoload_with=engine)
query = sqlalchemy.select([invoices.columns.CustomerId.distinct()])

ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
ResultSet[:5]

[(1,), (2,), (3,), (4,), (5,)]
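
Aggregates, GROUP BY and DISTINCT can be checked against hand-computable numbers. The sketch below assumes SQLAlchemy 1.4 or newer and uses a hypothetical sales table with integer totals so the expected sums are easy to verify.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, insert, select, func)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
sales = Table("sales", metadata,               # hypothetical table
              Column("CustomerId", Integer),
              Column("Total", Integer))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(sales), [
        {"CustomerId": 1, "Total": 10},
        {"CustomerId": 1, "Total": 5},
        {"CustomerId": 2, "Total": 7},
    ])

with engine.connect() as conn:
    # SELECT SUM(Total) FROM sales
    grand_total = conn.execute(select(func.sum(sales.c.Total))).scalar()
    # SELECT CustomerId, SUM(Total) AS Total FROM sales GROUP BY CustomerId
    per_customer = conn.execute(
        select(sales.c.CustomerId, func.sum(sales.c.Total).label("Total"))
        .group_by(sales.c.CustomerId)
        .order_by(sales.c.CustomerId)
    ).fetchall()
    # SELECT DISTINCT CustomerId FROM sales
    customers = conn.execute(
        select(sales.c.CustomerId).distinct().order_by(sales.c.CustomerId)
    ).fetchall()

print(grand_total, per_customer, customers)
```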

#### joins

In [57]:
artists = sqlalchemy.Table('artists', metadata, autoload=True, autoload_with=engine)
albums = sqlalchemy.Table('albums', metadata, autoload=True, autoload_with=engine)

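### Automatic Join

Note: selecting columns from two tables without a join condition does not join them on the foreign key. The query in the next cell produces a Cartesian product, which is why its output pairs AC/DC with every album title.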

In [61]:
query = sqlalchemy.select([artists.columns.Name, albums.columns.Title])
results = connection.execute(query).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(5)

| | Name | Title |
|---|---|---|
| 0 | AC/DC | For Those About To Rock We Salute You |
| 1 | AC/DC | Balls to the Wall |
| 2 | AC/DC | Restless and Wild |
| 3 | AC/DC | Let There Be Rock |
| 4 | AC/DC | Big Ones |


### Manual Join

In [66]:
query = sqlalchemy.select([artists, albums])
query = query.select_from(artists.join(albums, artists.columns.ArtistId == albums.columns.ArtistId))
results = connection.execute(query).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(5)

| | ArtistId | Name | AlbumId | Title | ArtistId.1 |
|---|---|---|---|---|---|
| 0 | 1 | AC/DC | 1 | For Those About To Rock We Salute You | 1 |
| 1 | 2 | Accept | 2 | Balls to the Wall | 2 |
| 2 | 2 | Accept | 3 | Restless and Wild | 2 |
| 3 | 1 | AC/DC | 4 | Let There Be Rock | 1 |
| 4 | 3 | Aerosmith | 5 | Big Ones | 3 |
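
A join whose ON clause is inferred rather than written out is possible when the table metadata declares a foreign key. The sketch below, assuming SQLAlchemy 1.4 or newer and two small stand-in tables, compares an explicit ON clause with one that join() derives from the ForeignKey.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        String, ForeignKey, insert, select)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
artists = Table("artists", metadata,
                Column("ArtistId", Integer, primary_key=True),
                Column("Name", String(120)))
albums = Table("albums", metadata,
               Column("AlbumId", Integer, primary_key=True),
               Column("Title", String(160)),
               Column("ArtistId", Integer, ForeignKey("artists.ArtistId")))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(artists), [{"ArtistId": 1, "Name": "AC/DC"},
                                   {"ArtistId": 2, "Name": "Accept"}])
    conn.execute(insert(albums), [
        {"AlbumId": 1, "Title": "Let There Be Rock", "ArtistId": 1},
        {"AlbumId": 2, "Title": "Balls to the Wall", "ArtistId": 2},
        {"AlbumId": 3, "Title": "Restless and Wild", "ArtistId": 2},
    ])

def run(join_clause):
    """Select artist name and album title through the given join."""
    query = (select(artists.c.Name, albums.c.Title)
             .select_from(join_clause)
             .order_by(albums.c.AlbumId))
    with engine.connect() as conn:
        return [tuple(row) for row in conn.execute(query)]

# ON clause written out by hand
explicit = run(artists.join(albums, artists.c.ArtistId == albums.c.ArtistId))
# ON clause inferred from the ForeignKey declared on albums.ArtistId
inferred = run(artists.join(albums))
print(inferred)
```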


## Creating and Inserting Data into Tables

If the SQLite file named in the engine URL does not exist yet, it is created automatically when the first connection is opened.

In [68]:
engine = sqlalchemy.create_engine('sqlite:///../SampleDBs/test.sqlite') #Create test.sqlite automatically
connection = engine.connect()
metadata = sqlalchemy.MetaData()

emp = sqlalchemy.Table('emp', metadata,
                       sqlalchemy.Column('Id', sqlalchemy.Integer()),
                       sqlalchemy.Column('name', sqlalchemy.String(255), nullable=False),
                       sqlalchemy.Column('salary', sqlalchemy.Float(), default=100.0),
                       sqlalchemy.Column('active', sqlalchemy.Boolean(), default=True)
                      )

metadata.create_all(engine) #Creates the table

In [69]:
#Inserting record one by one

query = sqlalchemy.insert(emp).values(Id=1, name='naveen', salary=60000.00, active=True) 
ResultProxy = connection.execute(query)

In [70]:
#Inserting many records at once

query = sqlalchemy.insert(emp) 
values_list = [{'Id': 2, 'name':'ram', 'salary':80000, 'active':False},
               {'Id': 3, 'name':'ramesh', 'salary':70000, 'active':True}]
ResultProxy = connection.execute(query,values_list)

In [71]:
results = connection.execute(sqlalchemy.select([emp])).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(4)

| | Id | name | salary | active |
|---|---|---|---|---|
| 0 | 1 | naveen | 60000.0 | True |
| 1 | 2 | ram | 80000.0 | False |
| 2 | 3 | ramesh | 70000.0 | True |
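
The same table creation and inserts can be sketched against an in-memory database. Two things in this version are my assumptions rather than part of the cells above: the writes run inside engine.begin(), which commits on success (SQLAlchemy 2.0 no longer autocommits), and the multi-row insert omits salary and active to show the column defaults being applied.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        String, Float, Boolean, insert, select)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
emp = Table("emp", metadata,
            Column("Id", Integer()),
            Column("name", String(255), nullable=False),
            Column("salary", Float(), default=100.0),
            Column("active", Boolean(), default=True))
metadata.create_all(engine)

with engine.begin() as conn:
    # One record, all values given explicitly
    conn.execute(insert(emp).values(Id=1, name="naveen",
                                    salary=60000.0, active=True))
    # Several records at once; missing columns fall back to their defaults
    conn.execute(insert(emp), [{"Id": 2, "name": "ram"},
                               {"Id": 3, "name": "ramesh"}])

with engine.connect() as conn:
    rows = [tuple(r) for r in conn.execute(select(emp).order_by(emp.c.Id))]

print(rows)
```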


### Updating data in Databases

sqlalchemy.update(table_name).values(attribute = new_value).where(condition)

In [72]:
results = connection.execute(sqlalchemy.select([emp])).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(4)

| | Id | name | salary | active |
|---|---|---|---|---|
| 0 | 1 | naveen | 60000.0 | True |
| 1 | 2 | ram | 80000.0 | False |
| 2 | 3 | ramesh | 70000.0 | True |


In [73]:
# Build a statement to update the salary of Id = 1 to 100000
query = sqlalchemy.update(emp).values(salary = 100000)
query = query.where(emp.columns.Id == 1)
results = connection.execute(query)

In [74]:
results = connection.execute(sqlalchemy.select([emp])).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(4)

| | Id | name | salary | active |
|---|---|---|---|---|
| 0 | 1 | naveen | 100000.0 | True |
| 1 | 2 | ram | 80000.0 | False |
| 2 | 3 | ramesh | 70000.0 | True |
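
A compact sketch of the update pattern, assuming an in-memory stand-in for the emp table. The rowcount attribute of the result reports how many rows matched the WHERE clause, which is a cheap way to confirm the statement hit what was intended.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        String, Float, insert, select, update)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
emp = Table("emp", metadata,
            Column("Id", Integer()),
            Column("name", String(255)),
            Column("salary", Float()))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(emp), [
        {"Id": 1, "name": "naveen", "salary": 60000.0},
        {"Id": 2, "name": "ram", "salary": 80000.0},
    ])
    # UPDATE emp SET salary = 100000 WHERE Id = 1
    result = conn.execute(
        update(emp).where(emp.c.Id == 1).values(salary=100000))
    rows_updated = result.rowcount

with engine.connect() as conn:
    new_salary = conn.execute(
        select(emp.c.salary).where(emp.c.Id == 1)).scalar()

print(rows_updated, new_salary)
```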


#### Deleting Rows

sqlalchemy.delete(table_name).where(condition)

In [76]:
results = connection.execute(sqlalchemy.select([emp])).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(4)

| | Id | name | salary | active |
|---|---|---|---|---|
| 0 | 1 | naveen | 100000.0 | True |
| 1 | 2 | ram | 80000.0 | False |
| 2 | 3 | ramesh | 70000.0 | True |


In [77]:
# Build a statement to delete where salary < 100000
query = sqlalchemy.delete(emp)
query = query.where(emp.columns.salary < 100000)
results = connection.execute(query)

In [78]:
results = connection.execute(sqlalchemy.select([emp])).fetchall()
df = pd.DataFrame(results)
df.columns = results[0].keys()
df.head(4)

| | Id | name | salary | active |
|---|---|---|---|---|
| 0 | 1 | naveen | 100000.0 | True |
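
The delete pattern can be sketched the same way. This assumes an in-memory stand-in for the emp table with made-up salaries; only the rows matching the condition are removed, and the table itself stays in place.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        String, Float, insert, select, delete)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
emp = Table("emp", metadata,
            Column("Id", Integer()),
            Column("name", String(255)),
            Column("salary", Float()))
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(emp), [
        {"Id": 1, "name": "naveen", "salary": 100000.0},
        {"Id": 2, "name": "ram", "salary": 80000.0},
        {"Id": 3, "name": "ramesh", "salary": 70000.0},
    ])
    # DELETE FROM emp WHERE salary < 100000
    result = conn.execute(delete(emp).where(emp.c.salary < 100000))
    rows_deleted = result.rowcount

with engine.connect() as conn:
    survivors = [r[0] for r in conn.execute(select(emp.c.name))]

print(rows_deleted, survivors)
```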


#### Dropping a Table

table_name.drop(engine) #drops a single table  
metadata.drop_all(engine) #drops all the tables registered in this MetaData object
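
A short sketch of both calls, assuming two throwaway tables named first_table and second_table. The inspector is used only to list which tables exist after each step.

```python
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, inspect)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
first_table = Table("first_table", metadata,
                    Column("id", Integer, primary_key=True))
second_table = Table("second_table", metadata,
                     Column("id", Integer, primary_key=True))
metadata.create_all(engine)

before = sorted(inspect(engine).get_table_names())

first_table.drop(engine)                 # drops a single table
after_one = inspect(engine).get_table_names()

metadata.drop_all(engine)                # drops whatever is left in metadata
after_all = inspect(engine).get_table_names()

print(before, after_one, after_all)
```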

## A full example of Database creation and management using SQLAlchemy ORM

![](../SampleDBs/SQLAlchemyORM.png)

#### Let's set up this database:

points_of_interest is the main table which has a zero-to-many relationship with the other three tables. For example, a given point of interest may have no associated Architects or it may have several Architects.

### Option 1 — Raw SQL

In [80]:
from sqlalchemy import create_engine
#db = create_engine('dialect+driver://user:pass@host:port/db')
#db = create_engine(f'postgresql://{DB_USER}:{DB_PASS}@{IP}:{DB_PORT}/{DB_NAME}')

db = create_engine('sqlite:///../SampleDBs/architect.sqlite')  # named db because the next cell calls db.execute

In [None]:
# create main table

db.execute("""
CREATE TABLE IF NOT EXISTS points_of_interest (
    poi_id BIGSERIAL PRIMARY KEY,
    name text,
    build_year text, 
    demolished_year text,
    address text, 
    latitude float, 
    longitude float,
    source text, 
    external_url text, 
    details text,
    image_url text, 
    heritage_status text, 
    current_use text, 
    poi_type text)
    """)

# create architectural styles TABLE

db.execute("""
CREATE TABLE IF NOT EXISTS architectural_styles (
    poi_id int,style text)
    """)

# create architects TABLE

db.execute("""
CREATE TABLE IF NOT EXISTS architects (
    poi_id int,
    architect_name text)
    """)

# create categories TABLE

db.execute("""CREATE TABLE IF NOT EXISTS poi_categories (
    poi_id int,
    category text)
    """)

This code does not establish a formal relationship between the tables (i.e. it is up to me as the developer to maintain the links). So, for example, as I started to load data from a pandas data frame into the database, I had to save the generated primary key from the points_of_interest table and use that when submitting entries to the architectural_styles table.
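
The manual bookkeeping described above can be sketched with the standard-library sqlite3 module. The table layout is cut down to two columns and the row values are made up; the point is only that the generated key has to be captured and passed along by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE points_of_interest (poi_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE architectural_styles (poi_id INTEGER, style TEXT)")

# Insert the parent row and remember the key the database generated
cursor = conn.execute(
    "INSERT INTO points_of_interest (name) VALUES (?)",
    ("Hypothetical Hall",))
new_poi_id = cursor.lastrowid

# The developer is responsible for reusing that key in the child table
conn.execute(
    "INSERT INTO architectural_styles (poi_id, style) VALUES (?, ?)",
    (new_poi_id, "Art Deco"))
conn.commit()

linked = conn.execute(
    "SELECT p.name, s.style FROM points_of_interest p "
    "JOIN architectural_styles s ON s.poi_id = p.poi_id").fetchall()
print(linked)
```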

### Option 2 — SQLAlchemy ORM

In [None]:
from sqlalchemy import create_engine, Column, Integer, String, Sequence, Float, PrimaryKeyConstraint, ForeignKey
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship, backref

### Using SQL Alchemy ORM

In [85]:
try:
    os.remove("../SampleDBs/architect.sqlite")
    print("removed file")
except FileNotFoundError:
    print("file does not exist")

file does not exist


In [87]:
db = create_engine('sqlite:///../SampleDBs/architect.sqlite')

#### Create Tables as Classes

Instead of writing SQL CREATE TABLE statements, you will define a class to represent each of your tables.

+ The class inherits from the declarative_base() class
+ Define __tablename__ (the actual table name that is created in the database)
+ Define columns and their types (note: these types are different from the PostgreSQL-specific text, float and bigserial types that I originally defined)

How do I set up an auto-incrementing primary key like BIGSERIAL?

There is no BIGSERIAL-equivalent column type in ORM, but you can accomplish the same thing by explicitly defining a sequence:

poi_id = Column(Integer, Sequence('poi_id_seq'), primary_key=True)
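
A minimal sketch of that column definition in use, assuming SQLAlchemy 1.4 or newer, an in-memory SQLite database, and a hypothetical Place class. SQLite has no sequences, so my understanding is that the Sequence is simply not used there and the integer primary key is filled in by the database; on PostgreSQL the named sequence would supply the values.

```python
from sqlalchemy import create_engine, Column, Integer, String, Sequence
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Place(Base):                      # hypothetical class for illustration
    __tablename__ = "places"
    place_id = Column(Integer, Sequence("place_id_seq"), primary_key=True)
    name = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# No place_id is given; it is generated when the rows are flushed
first, second = Place(name="First"), Place(name="Second")
session.add_all([first, second])
session.commit()

ids = (first.place_id, second.place_id)
print(ids)
```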

In [88]:
Base = declarative_base()

In [90]:
class PointsOfInterest(Base):
    __tablename__ = "points_of_interest"
    poi_id = Column(Integer, Sequence('poi_id_seq'), primary_key=True)
    name = Column(String)
    build_year = Column(String)
    build_decade = Column(Float)
    build_year_clean = Column(Float)
    demolished_year = Column(String)
    address = Column(String)
    latitude = Column(Float)
    longitude = Column(Float)
    external_url = Column(String)
    image_url = Column(String)
    heritage_status = Column(String)
    current_use = Column(String)
    poi_type = Column(String)
    poi_type_simple = Column(String)
    source = Column(String)
    details = Column(String)
    
    #Defining One to Many relationships with the relationship function on the Parent Table
    styles = relationship('ArchitecturalStyles', backref = 'points_of_interest',lazy=True,cascade="all, delete-orphan")
    architects = relationship('Architects', backref = 'points_of_interest', lazy=True,cascade="all, delete-orphan")
    categories = relationship('POICategories', backref = 'points_of_interest', lazy=True,cascade="all, delete-orphan")
    
    def print_me(self):
        print(f"id: {self.poi_id}")
        for col in self.__table__.columns:
            print(col.name, ":    ", getattr(self,col.name))

### Create the supporting / foreign-key tables

SQLAlchemy ORM requires every mapped table to have a primary key. Since the supporting tables have no single column that identifies a row, we define a composite primary key encompassing both columns. This has the added benefit of preventing you from inserting duplicate architectural styles for the same building (or from inserting null entries).

#### One-to-Many Relationship Pattern / Foreign Key

You can set up several types of relationship patterns in SQLAlchemy ORM, but I only needed to use the One-To-Many relationship. I found this part confusing and at first only defined the foreign key on the supplementary table. But you can take more advantage of the power of ORM if you also define the relationship on the main table.

The first step is indeed to define your ForeignKey and point it to the table.column of interest.

The next step is to define a bidirectional relationship between the two tables (a zero-to-many from PointsOfInterest to ArchitecturalStyles and a many-to-one from ArchitecturalStyles to PointsOfInterest). This makes it easier, for example, when loading data since ORM takes care of automatically using the newly generated poi_id when populating data into the other tables.

To do this, add a relationship to the main PointsOfInterest class and use the backref parameter to connect the two (see the PointsOfInterest class definition above).

+ styles = the relationship with ArchitecturalStyles will be named “styles” (this attribute will be exposed when working with the PointsOfInterest class)
+ backref = connects the two classes by adding a points_of_interest attribute to the child class
+ lazy = determines how the supporting tables get loaded when you query the main table. lazy=True is the default option and works for me here
+ cascade = if I delete a poi_id from the main table, the linked rows in architectural styles will be deleted too (so I don’t end up with orphaned entries)

Since I’m referencing the ArchitecturalStyles class from the PointsOfInterest class and vice versa, I kept running into “not yet defined” errors when running the code. The secret was to define the main class first (PointsOfInterest) and use quotes around the referenced class names when defining the relationship (relationship('ArchitecturalStyles')). This tells SQLAlchemy to resolve the class name later, once the class has been defined.
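
The behaviour of relationship, backref and cascade described above can be sketched with two cut-down classes. Building and BuildingStyle are hypothetical stand-ins for PointsOfInterest and ArchitecturalStyles, and the sketch assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Building(Base):
    __tablename__ = "buildings"
    building_id = Column(Integer, primary_key=True)
    name = Column(String)
    # Quoted class name: BuildingStyle is only defined further down
    styles = relationship("BuildingStyle", backref="building",
                          cascade="all, delete-orphan")

class BuildingStyle(Base):
    __tablename__ = "building_styles"
    # Composite primary key over both columns
    building_id = Column(Integer, ForeignKey("buildings.building_id"),
                         primary_key=True)
    style = Column(String, primary_key=True)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

hall = Building(name="Hypothetical Hall")
hall.styles.append(BuildingStyle(style="Art Deco"))  # no key set by hand
session.add(hall)
session.commit()

hall_id = hall.building_id
child = session.query(BuildingStyle).one()
child_key = child.building_id            # filled in by the ORM
parent_name = child.building.name        # attribute created by backref

session.delete(hall)                     # cascades to the child row
session.commit()
remaining = session.query(BuildingStyle).count()
print(child_key, parent_name, remaining)
```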

In [91]:
class ArchitecturalStyles(Base):
    __tablename__="architectural_styles"
    __table_args__ = (PrimaryKeyConstraint('poi_id', 'style'),)
    poi_id = Column(Integer,ForeignKey('points_of_interest.poi_id'))
    #Defining the Foreign Key on the Child Table
    style = Column(String)

In [92]:
class Architects(Base):
    __tablename__="architects"
    __table_args__ = (PrimaryKeyConstraint('poi_id', 'architect_name'),)
    poi_id= Column(Integer,ForeignKey('points_of_interest.poi_id'))
    architect_name = Column(String)

In [93]:
class POICategories(Base):
    __tablename__="poi_categories"
    __table_args__ = (PrimaryKeyConstraint('poi_id', 'category'),)
    poi_id =Column(Integer,ForeignKey('points_of_interest.poi_id'))
    category = Column(String)

Create the database tables

Run this command to actually create the tables in the database
    
    checkfirst = check if table already exists and skip the creation if it already exists

In [94]:
engine = db  # reuse the engine created above; connect_db() is not defined in this notebook
PointsOfInterest.__table__.create(bind=engine, checkfirst=True)
ArchitecturalStyles.__table__.create(bind=engine, checkfirst=True)
Architects.__table__.create(bind=engine, checkfirst=True)
POICategories.__table__.create(bind=engine, checkfirst=True)

## Using SQL Alchemy ORM

### Inserting Rows

Now that my classes were defined, I could use them in other modules to help load data. As usual, the first step is to import the necessary classes, including our new class definitions (PointsOfInterest etc) and establish a session:

In [95]:
# db is the engine created above with create_engine(); connect_db() is not defined in this notebook
Session = sessionmaker(bind=db)
session = Session()

The following function saves data from a dataframe to the database.

+ Updates three tables: points_of_interest, architectural_styles, architects
+ In this case, each Point of Interest has only one architectural style, so I define an instance of the ArchitecturalStyles class (style = ArchitecturalStyles(style=row['Style'])) and then append it to the Point of Interest instance (poi.styles.append(style)). When I commit this transaction, the entry in ArchitecturalStyles is automatically assigned the new poi_id generated in the main table
+ Similarly, there may be many architects, so I loop through the list and append one or more Architects instances to the Point of Interest

In [None]:
def save_to_database_ORM(session):
    # bld_df is the pandas DataFrame holding the building data
    for index, row in bld_df.iterrows():
        poi_dict = {'name': row['Name'],
                    'address': row['Street']}
        poi = PointsOfInterest(**poi_dict)

        # define style (in ArchitecturalStyles class)
        style = ArchitecturalStyles(style=row['Style'])
        poi.styles.append(style)

        # architects (can be multiple)
        for company in row['Companies']:
            architect = Architects(architect_name=company)
            poi.architects.append(architect)

        session.add(poi)
        session.commit()

### Accessing Data

More details can be found in the help docs, but here are a few quick tips on accessing data through SQLAlchemy ORM:

+ Get the count of rows in a table

session.query(PointsOfInterest).count()

+ Get an object by primary key

poi = session.query(PointsOfInterest).get(30)

+ Filter on a particular column/value

session.query(PointsOfInterest).filter(PointsOfInterest.build_year=='1905')

+ Load to a pandas dataframe

df = pd.read_sql(session.query(PointsOfInterest).statement, session.bind)

+ Access a linked table: because of the defined relationship, you can access a styles and architects list for each PointsOfInterest object

poi = session.query(PointsOfInterest).get(30)
for y in poi.architects:
    print(y.architect_name)

+ Delete an entry (cascades to linked tables, so if this point has an associated entry in the Architects table, it will be deleted too)

poi_to_delete = session.query(PointsOfInterest).filter(PointsOfInterest.poi_id==<id>).first()
session.delete(poi_to_delete)
session.commit()
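
A runnable sketch of the count, primary-key lookup and filter patterns, assuming SQLAlchemy 1.4 or newer and a hypothetical Landmark class with made-up rows. It uses session.get() for the lookup, since Query.get() is marked legacy from 1.4 onwards.

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Landmark(Base):                    # hypothetical class for illustration
    __tablename__ = "landmarks"
    landmark_id = Column(Integer, primary_key=True)
    name = Column(String)
    build_year = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([
    Landmark(landmark_id=1, name="Old Mill", build_year="1905"),
    Landmark(landmark_id=2, name="New Hall", build_year="1999"),
])
session.commit()

total = session.query(Landmark).count()          # row count
by_key = session.get(Landmark, 2)                # lookup by primary key
from_1905 = (session.query(Landmark)
             .filter(Landmark.build_year == "1905")
             .all())                             # filter on a column value

print(total, by_key.name, [lm.name for lm in from_1905])
```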

Creating the Tables
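
The class definitions for Edificio, CategoriaEdificio and Arquiteto are not shown in this notebook. The sketch below is one possible set of definitions, inferred only from the column names used in the cells that follow; the table names, the composite keys and the in-memory engine are my assumptions, not the original code.

```python
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Edificio(Base):
    __tablename__ = "edificios"                  # assumed table name
    id_edificio = Column(Integer, primary_key=True)
    nome = Column(String)
    ano_de_construcao = Column(String)           # stored as text, e.g. '2009'
    endereco = Column(String)

class CategoriaEdificio(Base):
    __tablename__ = "categorias_edificio"        # assumed table name
    id_edificio = Column(Integer, ForeignKey("edificios.id_edificio"),
                         primary_key=True)
    categoria = Column(String, primary_key=True)

class Arquiteto(Base):
    __tablename__ = "arquitetos"                 # assumed table name
    id_edificio = Column(Integer, ForeignKey("edificios.id_edificio"),
                         primary_key=True)
    nome_arquiteto = Column(String, primary_key=True)

engine = create_engine("sqlite:///:memory:")     # assumed engine
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(Edificio(nome="Casas Cubo", ano_de_construcao="1984",
                     endereco="Roterdã - Holanda"))
session.commit()
session.add(CategoriaEdificio(id_edificio=1, categoria="Comércio"))
session.commit()

building_count = session.query(Edificio).count()
category_count = session.query(CategoriaEdificio).count()
print(building_count, category_count)
```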

In [41]:
db = connect_db()
Session = sessionmaker(bind=db)
session = Session()

Inserting the Data

In [42]:
session.add(Edificio(nome = 'Linked Hybrid', ano_de_construcao = '2009', endereco = 'Pequim - China'))
session.add(Edificio(nome = 'Casa Moby Dick', ano_de_construcao = '2003', endereco = 'Espoo - Finlândia'))
session.add(Edificio(nome = 'Apartamentos WoZoCo', ano_de_construcao = '1997', endereco = 'Amsterdã - Holanda'))
session.add(Edificio(nome = 'Casas Cubo', ano_de_construcao = '1984', endereco = 'Roterdã - Holanda'))
session.add(Edificio(nome = 'Museu Perot de Ciência Natural', ano_de_construcao = '2012', endereco = 'Dallas - Estados Unidos'))
session.commit()

In [43]:
session.add(CategoriaEdificio(id_edificio = 1, categoria = "Comércio"))
session.add(CategoriaEdificio(id_edificio = 2, categoria = "Exposição"))
session.add(CategoriaEdificio(id_edificio = 3, categoria = "Moradia"))
session.add(CategoriaEdificio(id_edificio = 4, categoria = "Comércio"))
session.add(CategoriaEdificio(id_edificio = 5, categoria = "Museu"))
session.commit()

In [44]:
session.add(Arquiteto(id_edificio = 1, nome_arquiteto = "Su Dong Xia"))
session.add(Arquiteto(id_edificio = 2, nome_arquiteto = "Paavo Seppo Aadolf"))
session.add(Arquiteto(id_edificio = 3, nome_arquiteto = "Norbert Ágoston Odd"))
session.add(Arquiteto(id_edificio = 4, nome_arquiteto = "Jere Petri Aulis"))
session.add(Arquiteto(id_edificio = 5, nome_arquiteto = "Jóhannes Amrit Bogdan"))
session.commit()

Exploration

In [45]:
session.query(Edificio).count()

10

In [46]:
deletar_arquiteto = session.query(Arquiteto).filter(Arquiteto.id_edificio==1).first()
session.delete(deletar_arquiteto)
session.commit()

In [47]:
import pandas as pd
df = pd.read_sql(session.query(Edificio).statement, session.bind)

In [48]:
df

| | id_edificio | nome | ano_de_construcao | endereco |
|---|---|---|---|---|
| 0 | 1 | Linked Hybrid | 2009 | Pequim - China |
| 1 | 2 | Casa Moby Dick | 2003 | Espoo - Finlândia |
| 2 | 3 | Apartamentos WoZoCo | 1997 | Amsterdã - Holanda |
| 3 | 4 | Casas Cubo | 1984 | Roterdã - Holanda |
| 4 | 5 | Museu Perot de Ciência Natural | 2012 | Dallas - Estados Unidos |
| 5 | 6 | Linked Hybrid | 2009 | Pequim - China |
| 6 | 7 | Casa Moby Dick | 2003 | Espoo - Finlândia |
| 7 | 8 | Apartamentos WoZoCo | 1997 | Amsterdã - Holanda |
| 8 | 9 | Casas Cubo | 1984 | Roterdã - Holanda |
| 9 | 10 | Museu Perot de Ciência Natural | 2012 | Dallas - Estados Unidos |


In [49]:
df.query("ano_de_construcao >= '2000'")

| | id_edificio | nome | ano_de_construcao | endereco |
|---|---|---|---|---|
| 0 | 1 | Linked Hybrid | 2009 | Pequim - China |
| 1 | 2 | Casa Moby Dick | 2003 | Espoo - Finlândia |
| 4 | 5 | Museu Perot de Ciência Natural | 2012 | Dallas - Estados Unidos |
| 5 | 6 | Linked Hybrid | 2009 | Pequim - China |
| 6 | 7 | Casa Moby Dick | 2003 | Espoo - Finlândia |
| 9 | 10 | Museu Perot de Ciência Natural | 2012 | Dallas - Estados Unidos |


In [50]:
df.query('endereco.str.contains("Holanda")', engine='python')

| | id_edificio | nome | ano_de_construcao | endereco |
|---|---|---|---|---|
| 2 | 3 | Apartamentos WoZoCo | 1997 | Amsterdã - Holanda |
| 3 | 4 | Casas Cubo | 1984 | Roterdã - Holanda |
| 7 | 8 | Apartamentos WoZoCo | 1997 | Amsterdã - Holanda |
| 8 | 9 | Casas Cubo | 1984 | Roterdã - Holanda |
