
This tutorial can be found at https://realpython.com/python-csv/

# Reading and writing CSV files in Python

In [None]:
import csv

# newline='' lets the csv module handle line endings itself
with open('/content/drive/MyDrive/dados_pandas/Real Python/employee_birthday.csv', newline='') as csv_file:
  csv_reader = csv.reader(csv_file, delimiter=',')
  line_count = 0
  for row in csv_reader:
    if line_count == 0:
      # The first row holds the column names
      print(f'Column names are {",".join(row)}')
      line_count += 1
    else:
      # Every other row is one employee record
      print(f'\t {row[0]} works in the {row[1]} department, and was born in {row[2]}.')
      line_count += 1
  # line_count includes the header row
  print(f'Processed {line_count} lines.')

Column names are name,department,birthday month
	 John Smith works in the Accounting department, and was born in November.
	 Erica Meyers works in the IT department, and was born in March.
Processed 3 lines.
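
The manual counter in the loop above is only there to tell the header row apart from the data rows. A minimal sketch of an alternative, assuming the same three-line file: `next()` pulls the header off the reader before the loop starts. The in-memory `io.StringIO` buffer and its contents are a stand-in for the Drive file, not the real data.

```python
import csv
import io

# Hypothetical in-memory stand-in for employee_birthday.csv
sample = io.StringIO(
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)

csv_reader = csv.reader(sample, delimiter=',')
header = next(csv_reader)   # first row holds the column names
rows = list(csv_reader)     # the remaining rows are data

print(f'Column names are {",".join(header)}')
for name, department, month in rows:
    print(f'\t {name} works in the {department} department, and was born in {month}.')
# +1 so the header row is counted, as in the cell above
print(f'Processed {len(rows) + 1} lines.')
```

Unpacking each row into `name, department, month` only works when every row has exactly three fields; for ragged files, indexing as in the original cell is safer.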


**Reading the same file into dictionaries**

In [None]:
import csv

# mode='r' is the default; DictReader takes its keys from the first row of the file
with open('/content/drive/MyDrive/dados_pandas/Real Python/employee_birthday.csv', mode='r', newline='') as csv_file:
  csv_reader = csv.DictReader(csv_file)
  line_count = 0
  for row in csv_reader:
    if line_count == 0:
      # Joining a dict joins its keys, i.e. the column names
      print(f'Column names are {",".join(row)}')
      line_count += 1   # count the header line once
    print(f'\t{row["name"]} works in the {row["department"]} department, and was born in {row["birthday month"]}.')
    line_count += 1
  print(f'Processed {line_count} lines.')

Column names are name,department,birthday month
	John Smith works in the Accounting department, and was born in November.
	Erica Meyers works in the IT department, and was born in March.
Processed 3 lines.
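
To see what `DictReader` actually yields, the rows can be collected into a list and inspected. This is a small sketch under the assumption that the file has the three columns shown in the output above; the `io.StringIO` buffer and its sample rows are hypothetical stand-ins for the Drive file.

```python
import csv
import io

# Hypothetical in-memory stand-in for employee_birthday.csv
sample = io.StringIO(
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)

csv_reader = csv.DictReader(sample)
records = list(csv_reader)          # one dict per data row

# fieldnames is filled from the header row
print(csv_reader.fieldnames)
# Each record maps column name -> cell value (always strings)
print(records[0]['name'], records[0]['birthday month'])
```

The header row never shows up in `records`, which is why the cell above has to count it separately.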


**Writing CSV files with `csv`**


In [None]:
import csv

# newline='' avoids blank lines between rows on platforms that translate line endings
with open('employee_file.csv', mode='w', newline='') as employee_file:
  employee_writer = csv.writer(employee_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

  employee_writer.writerow(['John Smith', 'Accounting', 'November'])
  employee_writer.writerow(['Erica Meyers','IT','March'])
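
With `csv.QUOTE_MINIMAL`, the writer only wraps a field in `quotechar` when it has to, for example when the field contains the delimiter. The sketch below writes to an in-memory `io.StringIO` buffer instead of a file so the result can be inspected directly; the name containing a comma is a made-up value for illustration.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

writer.writerow(['John Smith', 'Accounting', 'November'])
# Hypothetical value with an embedded comma, so it must be quoted
writer.writerow(['Meyers, Erica', 'IT', 'March'])

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```

Only the second row's first field comes out quoted; the other fields are written as-is.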

**Writing the same data from dictionaries**

In [None]:
import csv

with open('employee_file2.csv', mode='w', newline='') as csv_file:
  fieldnames = ['emp_name','dept','birth_month']
  writer = csv.DictWriter(csv_file, fieldnames=fieldnames)

  writer.writeheader()
  writer.writerow({'emp_name':'John Smith', 'dept':'Accounting','birth_month':'November'})
  writer.writerow({'emp_name':'Erica Meyers', 'dept':'IT','birth_month':'March'})
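
When the records already sit in a list, `writerows()` writes them in one call. The sketch below also reads the result back with `DictReader` to check the round trip; it uses an in-memory buffer rather than `employee_file2.csv`, which is an assumption made to keep the example self-contained.

```python
import csv
import io

fieldnames = ['emp_name', 'dept', 'birth_month']
employees = [
    {'emp_name': 'John Smith', 'dept': 'Accounting', 'birth_month': 'November'},
    {'emp_name': 'Erica Meyers', 'dept': 'IT', 'birth_month': 'March'},
]

# newline='' keeps the writer's own line endings untouched
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(employees)     # one call for all rows

# Rewind and read the same data back as dictionaries
buffer.seek(0)
round_trip = list(csv.DictReader(buffer))
print(round_trip == employees)
```

The comparison holds here because every value is already a string; numbers would come back as strings and need converting.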


# Parsing CSV files with the pandas library

In [14]:
import pandas as pd


In [15]:
df = pd.read_csv('/content/drive/MyDrive/dados_pandas/Real Python/hrdata.csv')
print(df)

             Name Hire Date   Salary  Sick Days remaining
0  Graham Chapman  03/15/14  50000.0                   10
1     John Cleese  06/01/15  65000.0                    8
2       Eric Idle  05/12/14  45000.0                   10
3     Terry Jones  11/01/13  70000.0                    3
4   Terry Gilliam  08/12/14  48000.0                    7
5   Michael Palin  05/23/13  66000.0                    8


In [17]:
print(type(df['Hire Date'][0]))

<class 'str'>


In [19]:
# Use a column as the index
df = pd.read_csv('/content/drive/MyDrive/dados_pandas/Real Python/hrdata.csv', index_col='Name')
print(df)

               Hire Date   Salary  Sick Days remaining
Name
Graham Chapman  03/15/14  50000.0                   10
John Cleese     06/01/15  65000.0                    8
Eric Idle       05/12/14  45000.0                   10
Terry Jones     11/01/13  70000.0                    3
Terry Gilliam   08/12/14  48000.0                    7
Michael Palin   05/23/13  66000.0                    8


In [21]:
# Parse the 'Hire Date' column as dates instead of strings
df = pd.read_csv('/content/drive/MyDrive/dados_pandas/Real Python/hrdata.csv', index_col='Name', parse_dates=['Hire Date'])
print(df)

                Hire Date   Salary  Sick Days remaining
Name                                                   
Graham Chapman 2014-03-15  50000.0                   10
John Cleese    2015-06-01  65000.0                    8
Eric Idle      2014-05-12  45000.0                   10
Terry Jones    2013-11-01  70000.0                    3
Terry Gilliam  2014-08-12  48000.0                    7
Michael Palin  2013-05-23  66000.0                    8


In [22]:
print(type(df['Hire Date'][0]))

<class 'pandas._libs.tslibs.timestamps.Timestamp'>
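
The same conversion can also be done after loading, with an explicit format string instead of relying on `parse_dates` to infer one. This is a sketch, not the tutorial's method: the CSV text is reconstructed from the first two rows of the output above, and it is read from an `io.StringIO` buffer in place of the Drive file.

```python
import io
import pandas as pd

# Sample rows reconstructed from the printed output; the real file lives on Drive
csv_text = (
    "Name,Hire Date,Salary,Sick Days remaining\n"
    "Graham Chapman,03/15/14,50000.0,10\n"
    "John Cleese,06/01/15,65000.0,8\n"
)

df = pd.read_csv(io.StringIO(csv_text), index_col='Name')
before = type(df['Hire Date'].iloc[0])   # plain str straight after loading

# Convert with an explicit month/day/two-digit-year format
df['Hire Date'] = pd.to_datetime(df['Hire Date'], format='%m/%d/%y')
after = type(df['Hire Date'].iloc[0])    # pandas Timestamp after conversion

print(before, after)
```

Spelling out `%m/%d/%y` removes any ambiguity between month-first and day-first dates such as `05/12/14`.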


In [23]:
df = pd.read_csv('/content/drive/MyDrive/dados_pandas/Real Python/hrdata.csv', index_col='Employee', parse_dates=['Hired'], header=0, names=['Employee','Hired','Salary','Sick Days'])
print(df)

                    Hired   Salary  Sick Days
Employee                                     
Graham Chapman 2014-03-15  50000.0         10
John Cleese    2015-06-01  65000.0          8
Eric Idle      2014-05-12  45000.0         10
Terry Jones    2013-11-01  70000.0          3
Terry Gilliam  2014-08-12  48000.0          7
Michael Palin  2013-05-23  66000.0          8


**Saving a CSV**

In [24]:
df = pd.read_csv('/content/drive/MyDrive/dados_pandas/Real Python/hrdata.csv', index_col='Employee',parse_dates=['Hired'],header=0, names=['Employee','Hired','Salary','Sick Days'])

In [26]:
df.to_csv('hrdata_modified.csv')

In [27]:
df_m = pd.read_csv('/content/hrdata_modified.csv')
df_m

         Employee       Hired   Salary  Sick Days
0  Graham Chapman  2014-03-15  50000.0         10
1     John Cleese  2015-06-01  65000.0          8
2       Eric Idle  2014-05-12  45000.0         10
3     Terry Jones  2013-11-01  70000.0          3
4   Terry Gilliam  2014-08-12  48000.0          7
5   Michael Palin  2013-05-23  66000.0          8
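
Reading the saved file back with a bare `pd.read_csv` turns `Employee` into an ordinary column and leaves `Hired` as text, because a CSV file stores neither the index nor the dtypes. A minimal sketch of a round trip that restores both, assuming the same column names; it uses two rows taken from the output above and an in-memory buffer in place of `hrdata_modified.csv`.

```python
import io
import pandas as pd

# Two rows taken from the output above, with Employee as the index
df = pd.DataFrame(
    {
        'Hired': pd.to_datetime(['2014-03-15', '2015-06-01']),
        'Salary': [50000.0, 65000.0],
        'Sick Days': [10, 8],
    },
    index=pd.Index(['Graham Chapman', 'John Cleese'], name='Employee'),
)

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# Name the index column and the date column again when reading back
restored = pd.read_csv(buffer, index_col='Employee', parse_dates=['Hired'])
print(restored)
```

Without `index_col` and `parse_dates` on the second read, the frame would match the `df_m` output above instead.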
