## 169. Majority Element

Given an array nums of size n, return the majority element.

The majority element is the element that appears more than ⌊n / 2⌋ times. You may assume that the majority element always exists in the array.

Submit solution here: https://leetcode.com/problems/majority-element/description/ 
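
One possible approach, sketched here as a starting point rather than the expected answer, is the Boyer-Moore voting algorithm. It relies on the problem's guarantee that a majority element always exists; the function name `majority_element` is an assumption chosen for illustration.

```python
def majority_element(nums):
    """Return the element that appears more than len(nums) // 2 times.

    Assumes such an element exists, as the problem statement guarantees.
    """
    candidate = None
    count = 0
    for value in nums:
        # when the running count drops to zero, adopt the current value
        if count == 0:
            candidate = value
        # matching values vote for the candidate, all others vote against it
        count += 1 if value == candidate else -1
    return candidate


print(majority_element([3, 2, 3]))
print(majority_element([2, 2, 1, 1, 1, 2, 2]))
```

This runs in a single pass with constant extra memory. Without the guarantee, a second pass would be needed to confirm that the candidate really is a majority.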

## SQLAlchemy

Some more SQLAlchemy syntax.

First, let's pull up the code that allowed us to create our `epi_country` and `gdp` objects.

More information on the session object here: https://docs.sqlalchemy.org/en/20/orm/session_basics.html

In [48]:
from sqlalchemy import create_engine, func
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
import pandas as pd

# postgresql+psycopg2://postgres:@localhost/epi

# the `create_engine` function prepares a connection to the database
# should this info be public? 
engine = create_engine('postgresql+psycopg2://postgres:password@localhost:5434/epi')

# this object will automatically map our db entity into a Python class
Base = automap_base()

# reflect the database tables into the automapper
# (`autoload_with` replaces the deprecated `reflect=True` form)
Base.prepare(autoload_with=engine)

# save classes as variables, prepare classes
epi_country = Base.classes.epi_country
gdp = Base.classes.economic

# query our database (pull data and save into objects)
session = Session(engine)
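
On the question raised in the comment above: a connection string with a real password is usually kept out of shared notebooks. A minimal sketch of one way to do that is to read the pieces from environment variables. The variable names (`EPI_DB_USER`, `EPI_DB_PASSWORD`, and so on) and the helper `build_db_url` are assumptions for illustration, not part of the course setup.

```python
import os
from urllib.parse import quote


def build_db_url(env=os.environ):
    """Assemble a Postgres URL from (hypothetical) environment variables."""
    user = env.get("EPI_DB_USER", "postgres")
    # percent-encode the password so characters such as "@" or "/" stay valid in a URL
    password = quote(env.get("EPI_DB_PASSWORD", ""), safe="")
    host = env.get("EPI_DB_HOST", "localhost")
    port = env.get("EPI_DB_PORT", "5434")
    name = env.get("EPI_DB_NAME", "epi")
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{name}"


# a plain dict stands in for os.environ here so the example is reproducible
db_url = build_db_url({"EPI_DB_PASSWORD": "p@ss/word"})
print(db_url)
```

The resulting string can be passed to `create_engine` in place of the hard-coded URL.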

We can utilize the `func` object to apply aggregate functions to our queries just like we did with Postgres.

https://docs.sqlalchemy.org/en/14/core/sqlelement.html#sqlalchemy.sql.expression.func

Let's count the rows in the `economic` table before pulling any data.

In [49]:
session.query(func.count(gdp.country))

<sqlalchemy.orm.query.Query at 0x2189ab437f0>

Without calling `.all()`, we get back a `Query` object instead of a list of tuples; the query has been built but its results have not been fetched.

In [50]:
session.query(func.count(gdp.country)).all()

[(192,)]
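
To make the difference concrete, here is a minimal sketch that runs against an in-memory SQLite database instead of the `epi` Postgres database. The `Economic` model and its three sample rows are assumptions made up for illustration. `.scalar()` is an additional `Query` method, not used above, that unwraps a single value.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Economic(Base):  # hypothetical, reduced stand-in for the `economic` table
    __tablename__ = "economic"
    id = Column(Integer, primary_key=True)
    country = Column(String)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Economic(country=name) for name in ("Haiti", "Norway", "Norway")])
    session.commit()

    query = session.query(func.count(Economic.country))  # only builds the query
    as_rows = query.all()      # runs it and returns a list with one row
    as_value = query.scalar()  # runs it and returns the bare count

print(type(query).__name__, as_rows, as_value)
engine.dispose()
```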

If we want to return how many times a certain country shows up in our table, we can construct a query similar to

```sql
SELECT country, count(*)
FROM economic
GROUP BY country
```

via

In [51]:
session.query(gdp.country, func.count(gdp.country)).group_by(gdp.country).all()

[('Indonesia', 3),
 ('Switzerland', 4),
 ('New Zealand', 4),
 ('Italy', 4),
 ('Hungary', 4),
 ("China (People's Republic of)", 3),
 ('Russia', 4),
 ('Luxembourg', 4),
 ('Korea', 4),
 ('Czech Republic', 4),
 ('Sweden', 4),
 ('Norway', 4),
 ('United Kingdom', 4),
 ('Netherlands', 4),
 ('Brazil', 3),
 ('Austria', 4),
 ('Australia', 4),
 ('Ireland', 4),
 ('Germany', 4),
 ('G7', 4),
 ('European Union ? 27 countries (from 01/02/2020)', 4),
 ('Canada', 4),
 ('Portugal', 4),
 ('Finland', 4),
 ('Colombia', 4),
 ('Lithuania', 4),
 ('Slovak Republic', 4),
 ('Spain', 4),
 ('Latvia', 4),
 ('Slovenia', 4),
 ('Turkiye', 4),
 ('Greece', 4),
 ('India', 3),
 ('Belgium', 4),
 ('Chile', 4),
 ('Euro area (19 countries)', 4),
 ('France', 4),
 ('Estonia', 4),
 ('Israel', 4),
 ('South Africa', 4),
 ('Mexico', 4),
 ('OECD - Total', 4),
 ('Poland', 4),
 ('Iceland', 4),
 ('Costa Rica', 4),
 ('Japan', 4),
 ('Denmark', 4),
 ('Haiti', 1),
 ('BRIICS economies - Brazil, Russia, India, Indonesia, China and South Afric

We can go further with method chaining in SQLAlchemy, here ordering the groups by their row count in descending order:

In [52]:
session.query(gdp.country, func.count(gdp.country)).group_by(gdp.country).order_by(func.count(gdp.country).desc()).all()

[('United States', 4),
 ('Switzerland', 4),
 ('New Zealand', 4),
 ('Italy', 4),
 ('Hungary', 4),
 ('OECD - Total', 4),
 ('Poland', 4),
 ('Iceland', 4),
 ('Costa Rica', 4),
 ('Japan', 4),
 ('Denmark', 4),
 ('Russia', 4),
 ('Luxembourg', 4),
 ('Korea', 4),
 ('Czech Republic', 4),
 ('Sweden', 4),
 ('Norway', 4),
 ('United Kingdom', 4),
 ('Netherlands', 4),
 ('Austria', 4),
 ('Australia', 4),
 ('Ireland', 4),
 ('Germany', 4),
 ('G7', 4),
 ('European Union ? 27 countries (from 01/02/2020)', 4),
 ('Canada', 4),
 ('Portugal', 4),
 ('Finland', 4),
 ('Colombia', 4),
 ('Lithuania', 4),
 ('Slovak Republic', 4),
 ('Spain', 4),
 ('Latvia', 4),
 ('Slovenia', 4),
 ('Turkiye', 4),
 ('Greece', 4),
 ('Belgium', 4),
 ('Chile', 4),
 ('Euro area (19 countries)', 4),
 ('France', 4),
 ('Estonia', 4),
 ('Israel', 4),
 ('South Africa', 4),
 ('Mexico', 4),
 ("China (People's Republic of)", 3),
 ('India', 3),
 ('BRIICS economies - Brazil, Russia, India, Indonesia, China and South Africa', 3),
 ('Indonesia', 3),


Going back to yesterday's material, we can use the following code to get every column available in a table.

We simply pass the entire mapped class to the `query` function.

In [26]:
rows = session.query(epi_country)
for row in rows.all():
    print(row)

<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE650>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE5F0>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE680>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE6E0>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE740>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE770>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE800>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE890>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE920>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EE9B0>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EEA40>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EEAD0>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EEB60>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EEBF0>
<sqlalchemy.ext.automap.epi_country object at 0x00000218958EEC80>
<sqlalchem

Notice that this gives you each row saved as an object. If we would like to get the discrete values saved in each object, we can add an inner for loop.

In [None]:
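from sqlalchemy import text

rows = session.query(epi_country)
for row in rows.all():
    # __dict__ also holds SQLAlchemy's internal `_sa_instance_state`, so skip it
    for key, value in row.__dict__.items():
        if key != "_sa_instance_state":
            print(value, end=" ")
    print("")

# alternatively, raw SQL returns each row as a plain tuple
# (`engine.execute` was removed in SQLAlchemy 2.0, so go through a connection)
with engine.connect() as conn:
    rows = conn.execute(text("SELECT * FROM economic"))
    print(rows.fetchall())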
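
Another way to pull the values out of each row object, sketched here under assumptions, is to walk the table's column definitions instead of `__dict__`. The `EpiCountry` model, its columns, and the helper `row_to_dict` are hypothetical stand-ins for the automapped `epi_country` class, and the example runs on an in-memory SQLite database.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class EpiCountry(Base):  # hypothetical stand-in for the automapped class
    __tablename__ = "epi_country"
    id = Column(Integer, primary_key=True)
    country = Column(String)
    location = Column(String)


def row_to_dict(row):
    """Map each column name of a mapped object to its value."""
    return {column.name: getattr(row, column.name) for column in row.__table__.columns}


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(EpiCountry(country="Haiti", location="HAI"))
    session.commit()
    records = [row_to_dict(row) for row in session.query(EpiCountry).all()]

print(records)
engine.dispose()
```

Because only declared columns are visited, SQLAlchemy's internal attributes never show up in the result.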

If we attempt an erroneous query, the failure halts our workflow: PostgreSQL aborts the current transaction, and we **must** roll back the session before any further query will work.

In [61]:
# bad query
#session.query(gdp.country, func.count(gdp.country)).all()
session.query(gdp.country, func.count(gdp.country)).group_by(gdp.country).all()

# rollback
session.rollback()

[('Indonesia', 3),
 ('Switzerland', 4),
 ('New Zealand', 4),
 ('Italy', 4),
 ('Hungary', 4),
 ("China (People's Republic of)", 3),
 ('Russia', 4),
 ('Luxembourg', 4),
 ('Korea', 4),
 ('Czech Republic', 4),
 ('Sweden', 4),
 ('Norway', 4),
 ('United Kingdom', 4),
 ('Netherlands', 4),
 ('Brazil', 3),
 ('Austria', 4),
 ('Australia', 4),
 ('Ireland', 4),
 ('Germany', 4),
 ('G7', 4),
 ('European Union ? 27 countries (from 01/02/2020)', 4),
 ('Canada', 4),
 ('Portugal', 4),
 ('Finland', 4),
 ('Colombia', 4),
 ('Lithuania', 4),
 ('Slovak Republic', 4),
 ('Spain', 4),
 ('Latvia', 4),
 ('Slovenia', 4),
 ('Turkiye', 4),
 ('Greece', 4),
 ('India', 3),
 ('Belgium', 4),
 ('Chile', 4),
 ('Euro area (19 countries)', 4),
 ('France', 4),
 ('Estonia', 4),
 ('Israel', 4),
 ('South Africa', 4),
 ('Mexico', 4),
 ('OECD - Total', 4),
 ('Poland', 4),
 ('Iceland', 4),
 ('Costa Rica', 4),
 ('Japan', 4),
 ('Denmark', 4),
 ('Haiti', 1),
 ('BRIICS economies - Brazil, Russia, India, Indonesia, China and South Afric
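
The rollback step can also be wrapped in a `try`/`except` so a failed query never leaves the session stuck. The following is a minimal sketch under assumptions: it uses an in-memory SQLite database, which is more forgiving than PostgreSQL about failed statements, so it only illustrates the pattern. The helper name `run_safely` and the table name `missing_table` are made up for illustration.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import Session

engine = create_engine("sqlite:///:memory:")


def run_safely(session, statement):
    """Execute a statement; on failure, roll back and return None."""
    try:
        return session.execute(statement).all()
    except SQLAlchemyError as exc:
        session.rollback()  # reset the transaction so later queries can run
        print(f"query failed, rolled back: {type(exc).__name__}")
        return None


with Session(engine) as session:
    bad = run_safely(session, text("SELECT * FROM missing_table"))  # table does not exist
    good = run_safely(session, text("SELECT 1"))                    # works after the rollback

print(bad, good)
engine.dispose()
```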

We notice that the `economic` table is severely lacking in content. Let's add a few more rows of data to our database.

Namely, we will `INSERT` Haiti's 2010 GDP information into the economic table using our `session` object.

https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.add

In [62]:
# we first create a new object

new_country = gdp(location="HAI", country="Haiti", subject_code="T_GDP", subject="Gross Domestic Product (GDP); millions", measure_code="VPVOB", measure="USD, constant prices, 2015 PPPs", year=2010, unit_code="USD", unit="US Dollar", 
            power_code_id=6, power_code="Millions", value=11860)
session.add(new_country)

Just like with Git, our changes are not saved to the database until we commit.

In [None]:
session.commit()
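
To see the add-then-commit cycle end to end, here is a minimal sketch on an in-memory SQLite database. The `Economic` model is a reduced, hypothetical version of the `economic` table with only three data columns; the Haiti values are taken from the cell above.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Economic(Base):  # reduced, hypothetical version of the `economic` table
    __tablename__ = "economic"
    id = Column(Integer, primary_key=True)
    country = Column(String)
    year = Column(Integer)
    value = Column(Integer)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Economic(country="Haiti", year=2010, value=11860))
    pending = len(session.new)       # object is staged in the session only
    session.commit()                 # sends the INSERT and ends the transaction
    after_commit = len(session.new)  # nothing left to flush

# a fresh session confirms that the row was persisted
with Session(engine) as session:
    saved = session.query(Economic).filter_by(country="Haiti", year=2010).one()
    saved_value = saved.value

print(pending, after_commit, saved_value)
engine.dispose()
```

Calling `session.rollback()` instead of `session.commit()` would discard the staged object.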

With pandas, we can write an entire DataFrame to a SQL table with a single `to_sql` call.

`hdi` in this context is the HDI DataFrame.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html

In [43]:
import pandas as pd

hdi = pd.read_csv("data/HDI.csv")
hdi.head()

| | Id | Country | HDI Rank | HDI | Life expectancy | Mean years of schooling | Gross national income (GNI) per capita | GNI per capita rank minus HDI rank | Change in HDI rank 2010-2015 | Average annual HDI growth 1990-2000 | ... | Coefficient of human inequality | Inequality in life expectancy (%) 2010-2015 | Inequality-adjusted life expectancy index | Inequality in education(%) | Inequality-adjusted education index | Inequality in income (%) | Inequality-adjusted income index | Income inequality (Quintile ratio) 2010-2015 | Income inequality (Palma ratio) 2010-2015 | Income inequality (Gini coefficient) 2010-2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Norway | 1.0 | 0.949 | 81.7 | 12.7 | 67614.0 | 5.0 | 0.0 | 0.77 | ... | 5.4 | 3.3 | 0.918 | 2.4 | 0.894 | 10.4 | 0.882 | 3.8 | 0.9 | 25.9 |
| 1 | 2 | Australia | 2.0 | 0.939 | 82.5 | 13.2 | 42822.0 | 19.0 | 1.0 | 0.38 | ... | 8.0 | 4.3 | 0.921 | 1.9 | 0.921 | 17.7 | 0.753 | 6.0 | 1.4 | 34.9 |
| 2 | 3 | Switzerland | 2.0 | 0.939 | 83.1 | 13.4 | 56364.0 | 7.0 | 0.0 | 0.67 | ... | 8.4 | 3.8 | 0.934 | 5.7 | 0.84 | 15.7 | 0.806 | 4.9 | 1.2 | 31.6 |
| 3 | 4 | Germany | 4.0 | 0.926 | 81.1 | 13.2 | 45000.0 | 13.0 | 0.0 | 0.71 | ... | 7.0 | 3.7 | 0.905 | 2.6 | 0.891 | 14.8 | 0.787 | 4.6 | 1.1 | 30.1 |
| 4 | 5 | Denmark | 5.0 | 0.925 | 80.4 | 12.7 | 44519.0 | 13.0 | 2.0 | 0.76 | ... | 7.0 | 3.8 | 0.894 | 3.0 | 0.896 | 14.3 | 0.789 | 4.5 | 1.0 | 29.1 |


In [69]:
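from sqlalchemy import text

# write the DataFrame to a new `human_dev` table
# (pass if_exists="replace" to overwrite the table when re-running this cell)
hdi.to_sql('human_dev', engine)

# `engine.execute` was removed in SQLAlchemy 2.0, so run raw SQL through a connection
with engine.connect() as conn:
    query = conn.execute(text("SELECT * FROM human_dev"))
    rows = query.fetchall()
    columns = query.keys()

#for col in columns:
#    print(col)

df = pd.DataFrame(rows, columns=list(columns))
df.head()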

| | index | Id | Country | HDI Rank | HDI | Life expectancy | Mean years of schooling | Gross national income (GNI) per capita | GNI per capita rank minus HDI rank | Change in HDI rank 2010-2015 | ... | Coefficient of human inequality | Inequality in life expectancy (%) 2010-2015 | Inequality-adjusted life expectancy index | Inequality in education(%) | Inequality-adjusted education index | Inequality in income (%) | Inequality-adjusted income index | Income inequality (Quintile ratio) 2010-2015 | Income inequality (Palma ratio) 2010-2015 | Income inequality (Gini coefficient) 2010-2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | Norway | 1.0 | 0.949 | 81.7 | 12.7 | 67614.0 | 5.0 | 0.0 | ... | 5.4 | 3.3 | 0.918 | 2.4 | 0.894 | 10.4 | 0.882 | 3.8 | 0.9 | 25.9 |
| 1 | 1 | 2 | Australia | 2.0 | 0.939 | 82.5 | 13.2 | 42822.0 | 19.0 | 1.0 | ... | 8.0 | 4.3 | 0.921 | 1.9 | 0.921 | 17.7 | 0.753 | 6.0 | 1.4 | 34.9 |
| 2 | 2 | 3 | Switzerland | 2.0 | 0.939 | 83.1 | 13.4 | 56364.0 | 7.0 | 0.0 | ... | 8.4 | 3.8 | 0.934 | 5.7 | 0.84 | 15.7 | 0.806 | 4.9 | 1.2 | 31.6 |
| 3 | 3 | 4 | Germany | 4.0 | 0.926 | 81.1 | 13.2 | 45000.0 | 13.0 | 0.0 | ... | 7.0 | 3.7 | 0.905 | 2.6 | 0.891 | 14.8 | 0.787 | 4.6 | 1.1 | 30.1 |
| 4 | 4 | 5 | Denmark | 5.0 | 0.925 | 80.4 | 12.7 | 44519.0 | 13.0 | 2.0 | ... | 7.0 | 3.8 | 0.894 | 3.0 | 0.896 | 14.3 | 0.789 | 4.5 | 1.0 | 29.1 |
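
The extra `index` column in the output appears because `to_sql` writes the DataFrame index by default. A minimal sketch of the round trip without it follows, using `pd.read_sql` to load the table back in one call. To stay runnable without a Postgres server it uses a standard-library `sqlite3` connection; with the notebook's setup, `engine` would be passed instead. The two-row `hdi_sample` frame is an assumption built from the first rows shown above.

```python
import sqlite3

import pandas as pd

# tiny stand-in for the HDI DataFrame (values copied from the first two rows above)
hdi_sample = pd.DataFrame({"Country": ["Norway", "Australia"], "HDI": [0.949, 0.939]})

conn = sqlite3.connect(":memory:")

# index=False skips the index column; if_exists="replace" makes the cell safe to re-run
hdi_sample.to_sql("human_dev", conn, if_exists="replace", index=False)
hdi_sample.to_sql("human_dev", conn, if_exists="replace", index=False)

# read_sql runs the query and builds the DataFrame, column names included
df = pd.read_sql("SELECT * FROM human_dev", conn)
conn.close()

print(df)
```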


In [70]:
# be sure to run this
engine.dispose()