## Loading data into Pandas

Pandas can load data from many popular formats. This notebook shows how to load data from several sources, both local and remote.

### Create a Pandas DataFrame from an online CSV
This example requires a publicly available GitHub repository. A private repository requires authentication.

In [None]:
import pandas as pd
csv_url = "https://raw.githubusercontent.com/paiml/wine-ratings/main/wine-ratings.csv"
# set index_col to 0 to tell pandas that the first column is the index
df = pd.read_csv(csv_url, index_col=0)
df.head(10)
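
To see what `index_col=0` changes, here is a minimal sketch that reads a small in-memory CSV twice. The sample rows and column names are made up for illustration and are not taken from the wine ratings file.

```python
import io
import pandas as pd

# Hypothetical sample data; the first, unnamed column holds row labels
csv_text = ",name,rating\n0,Wine A,91.0\n1,Wine B,89.5\n"

# Without index_col, pandas keeps the first column as data named "Unnamed: 0"
plain = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row index instead
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(plain.columns.tolist())    # ['Unnamed: 0', 'name', 'rating']
print(indexed.columns.tolist())  # ['name', 'rating']
```

If a CSV was saved with its index, reading it back without `index_col=0` is a common source of a stray `Unnamed: 0` column.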

## Load a CSV from a local file

In [9]:
import pandas as pd
df = pd.read_csv("world-championship-qualifier.csv")
print(df)


    Rank                    Name     Nationality      Result    Notes Group
0    1.0           Svatoslav Ton  Czech Republic        2.14        q     A
1    1.0            Toni Huikuri        Finlandi        2.14        q     A
2    1.0          James Brierley  United Kingdom        2.14        q     A
3    1.0           Noriyasu Arai           Japan        2.14        q     A
4    5.0         Yannick Tregaro          Sweden        2.14        q     A
5    5.0       Dejan Vreljakovic              FR  Yugoslavia  2.14\tq     A
6    7.0            Alfredo Deza            Peru        2.10      NaN     A
7    8.0         Vagner Principe          Brazil        2.10      NaN     A
8    9.0  Alberto Juantorena Jr.            Cuba        2.10      NaN     A
9   10.0         Marcin Kaczocha          Poland        2.10      NaN     A
10  11.0         Andrey Krasulya         Ukraine        2.05      NaN     A
11  12.0            David Larsen   United States        2.05      NaN     A
12  13.0    

## Load JSON from a local file

In [10]:
df = pd.read_json("world-championship-qualifier.json")
df

| Unnamed: 0 | Rank | Name | Nationality | Result | Notes | Group |
|---|---|---|---|---|---|---|
| 0 | 1.0 | Svatoslav Ton | Czech Republic | 2.14 | q | A |
| 1 | 1.0 | Toni Huikuri | Finlandi | 2.14 | q | A |
| 2 | 1.0 | James Brierley | United Kingdom | 2.14 | q | A |
| 3 | 1.0 | Noriyasu Arai | Japan | 2.14 | q | A |
| 4 | 5.0 | Yannick Tregaro | Sweden | 2.14 | q | A |
| 5 | 5.0 | Dejan Vreljakovic | FR | Yugoslavia | 2.14\tq | A |
| 6 | 7.0 | Alfredo Deza | Peru | 2.10 |  | A |
| 7 | 8.0 | Vagner Principe | Brazil | 2.10 |  | A |
| 8 | 9.0 | Alberto Juantorena Jr. | Cuba | 2.10 |  | A |
| 9 | 10.0 | Marcin Kaczocha | Poland | 2.10 |  | A |
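
If you do not have a JSON file at hand, the following sketch writes a tiny DataFrame to a temporary JSON file and reads it back. The file name `sample.json` and the choice of `orient="records"` are assumptions made for this illustration; the layout of the notebook's own JSON file is not shown here.

```python
import os
import tempfile
import pandas as pd

# Two rows copied from the table above
df_small = pd.DataFrame(
    {
        "Rank": [1.0, 7.0],
        "Name": ["Svatoslav Ton", "Alfredo Deza"],
        "Nationality": ["Czech Republic", "Peru"],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "sample.json")  # hypothetical file name
    # orient="records" writes a list with one JSON object per row
    df_small.to_json(json_path, orient="records")
    # pass the same orient when reading so the layout is interpreted consistently
    df_back = pd.read_json(json_path, orient="records")

print(df_back)
```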


## You can read from many formats

The `pandas` module (imported here as `pd`) can read many different formats, including your clipboard!

- read_clipboard
- read_csv
- read_excel
- read_feather
- read_fwf
- read_gbq
- read_hdf
- read_html
- read_json
- read_orc
- read_parquet
- read_pickle
- read_sas
- read_spss
- read_sql
- read_sql_query
- read_sql_table
- read_stata
- read_table
- read_xml
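
The exact set of readers depends on the pandas version you have installed. As a rough sketch, you can list them yourself by inspecting the module; the variable name `readers` is only illustrative.

```python
import pandas as pd

# Collect every top-level callable in pandas whose name starts with "read_"
readers = sorted(
    name
    for name in dir(pd)
    if name.startswith("read_") and callable(getattr(pd, name))
)
print(readers)
```

Some of these readers need optional dependencies (for example, an Excel or Parquet engine) before they can be used.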

In [11]:
# export the dataset to an HTML file
df.to_html("dataset.html")
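
When no file path is given, `to_html` returns the markup as a string instead of writing a file. The small DataFrame below is a stand-in for illustration, not the notebook's `df`.

```python
import pandas as pd

# One row copied from the qualifier data, used as a stand-in for df
df_small = pd.DataFrame({"Name": ["Alfredo Deza"], "Result": [2.10]})

# With no path argument, to_html returns the HTML table as a string
html = df_small.to_html(index=False)
print(html)
```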

## Copy/paste to another format

`DataFrame.to_markdown()` depends on the `tabulate` package, so install it first.


In [17]:
pip install tabulate

Collecting tabulate
  Using cached tabulate-0.9.0-py3-none-any.whl (35 kB)
Installing collected packages: tabulate
Successfully installed tabulate-0.9.0
Note: you may need to restart the kernel to use updated packages.


In [18]:
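from pandas.io.clipboards import to_clipboard

# render the DataFrame as a Markdown table (requires tabulate)
md = df.to_markdown()
# copy the Markdown text to the clipboard; excel=False skips spreadsheet-style formatting
to_clipboard(md, excel=False)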

|    |   Rank | Name                   | Nationality    | Result     | Notes   | Group   |
|---:|-------:|:-----------------------|:---------------|:-----------|:--------|:--------|
|  0 |      1 | Svatoslav Ton          | Czech Republic | 2.14       | q       | A       |
|  1 |      1 | Toni Huikuri           | Finlandi       | 2.14       | q       | A       |
|  2 |      1 | James Brierley         | United Kingdom | 2.14       | q       | A       |
|  3 |      1 | Noriyasu Arai          | Japan          | 2.14       | q       | A       |
|  4 |      5 | Yannick Tregaro        | Sweden         | 2.14       | q       | A       |
|  5 |      5 | Dejan Vreljakovic      | FR             | Yugoslavia | 2.14	q         | A       |
|  6 |      7 | Alfredo Deza           | Peru           | 2.10       |         | A       |
|  7 |      8 | Vagner Principe        | Brazil         | 2.10       |         | A       |
|  8 |      9 | Alberto Juantorena Jr. | Cuba           | 2.10       |         | A       |
|  9 |     10 | Marcin Kaczocha        | Poland         | 2.10       |         | A       |
| 10 |     11 | Andrey Krasulya        | Ukraine        | 2.05       |         | A       |
| 11 |     12 | David Larsen           | United States  | 2.05       |         | A       |
| 12 |     13 | Ronald Garlett         | Australia      | 2.00       |         | A       |
| 13 |     13 | Oleg Prokopov          | Belarus        | 2.00       |         | A       |
| 14 |     15 | Felipe Apablaza        | Chile          | 2.00       |         | A       |
| 15 |     16 | Luis Soto Caballero    | Puerto Rico    | 2.00       |         | A       |
| 16 |    nan | Zoltán Akacz           | Hungary        | NH         |         | A       |
| 17 |      1 | Mark Boswell           | Canada         | 2.14       | q       | B       |
| 18 |      1 | Ben Challenger         | United Kingdom | 2.14       | q       | B       |
| 19 |      1 | Roman Fricke           | Germany        | 2.14       | q       | B       |
| 20 |      1 | Tivadar Kovács         | Hungary        | 2.14       | q       | B       |
| 21 |      1 | Dave Furman            | United States  | 2.14       | q       | B       |
| 22 |      6 | François Potgieter     | South Africa   | 2.14       | q       | B       |
| 23 |      7 | Sauli Niemi            | Finland        | 2.10       |         | B       |
| 24 |      7 | Katsuyoshi Miyamichi   | Japan          | 2.10       |         | B       |
| 25 |      9 | Marat Rakipov          | Russia         | 2.10       |         | B       |
| 26 |     10 | Adi Mordel             | Israel         | 2.10       |         | B       |
| 27 |     11 | Fabrício Romero        | Brazil         | 2.10       |         | B       |
| 28 |     12 | Abderahmane Hammad     | Algeria        | 2.05       |         | B       |
| 29 |     12 | Luke Temme             | Australia      | 2.05       |         | B       |
| 30 |     14 | Aleksey Lesnichiy      | Belarus        | 2.05       |         | B       |
| 31 |     15 | Ha Chung-Soo           | South Korea    | 2.00       |         | B       |
| 32 |     16 | Dejan Dokleja          | Croatia        | 2.00       |         | B       |
| 33 |    nan | Fawzi Warsame          | Somalia        | NH         |         | B       |