# **01 Intro to importing data in Python**

# Reading Text Files in Python

## 1. Using `open` and `close` manually

```python
file = open("archivo.txt", "r", encoding="utf-8")  # "r" = read mode
contenido = file.read()
print(contenido)
file.close()  
```

## 2. Using a *context manager*

```python
with open("archivo.txt", "r", encoding="utf-8") as file:
    contenido = file.read()
    print(contenido)
# the file is closed automatically when the block exits
```
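
The two approaches above can be compared with a small sketch. The temporary file and its two lines of text are made up for this illustration; the point is only to show the `"w"`, `"a"` and `"r"` modes and that the file object reports itself as closed once the `with` block ends.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")

# "w" creates the file (or truncates an existing one) before writing.
with open(path, "w", encoding="utf-8") as file:
    file.write("first line\n")

# "a" appends to the end and leaves the existing text in place.
with open(path, "a", encoding="utf-8") as file:
    file.write("second line\n")

# "r" opens the file for reading only.
with open(path, "r", encoding="utf-8") as file:
    content = file.read()

# The with block has already closed the file at this point.
was_closed = file.closed
print(content)
print(was_closed)
```

Because the context manager also closes the file when an exception is raised inside the block, it is usually the safer of the two forms.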

In [16]:
filename = "../datasets/hunk_finn.txt"
file = open(filename, mode='a')  # 'a' appends; 'w' would erase the existing text
#print(file.write("\nHOLA"))
file.close()

In [18]:
with open(filename, 'r', encoding='utf-8') as file:  # explicit encoding avoids garbled quotes and dashes
    print(file.read())

You don’t know about me, without you have read a book by the
name of The Adventures of Tom Sawyer; but that ain’t no matter. That
book was made by Mr. Mark Twain, and he told the truth, mainly.
There was things which he stretched, but mainly he told the truth.
That is nothing. I never seen anybody but lied one time or another,
without it was Aunt Polly, or the widow, or maybe Mary. Aunt
Polly—Tom’s Aunt Polly, she is—and Mary, and the Widow Douglas
is all told about in that book, which is mostly a true book, with some
stretchers, as I said before.


In [40]:
moby_file = open("../datasets/moby_dick.txt", 'r')

In [41]:
type(moby_file)

_io.TextIOWrapper

In [33]:
moby_file.read()

"CHAPTER 1. Loomings.\n\nCall me Ishmael. Some years ago--never mind how long precisely--having\nlittle or no money in my purse, and nothing particular to interest me on\nshore, I thought I would sail about a little and see the watery part of\nthe world. It is a way I have of driving off the spleen and regulating\nthe circulation. Whenever I find myself growing grim about the mouth;\nwhenever it is a damp, drizzly November in my soul; whenever I find\nmyself involuntarily pausing before coffin warehouses, and bringing up\nthe rear of every funeral I meet; and especially whenever my hypos get\nsuch an upper hand of me, that it requires a strong moral principle to\nprevent me from deliberately stepping into the street, and methodically\nknocking people's hats off--then, I account it high time to get to sea\nas soon as I can. This is my substitute for pistol and ball. With a\nphilosophical flourish Cato throws himself upon his sword; I quietly\ntake to the ship. There is nothing surprisin

In [42]:
moby_file.readlines()

['CHAPTER 1. Loomings.\n',
 '\n',
 'Call me Ishmael. Some years ago--never mind how long precisely--having\n',
 'little or no money in my purse, and nothing particular to interest me on\n',
 'shore, I thought I would sail about a little and see the watery part of\n',
 'the world. It is a way I have of driving off the spleen and regulating\n',
 'the circulation. Whenever I find myself growing grim about the mouth;\n',
 'whenever it is a damp, drizzly November in my soul; whenever I find\n',
 'myself involuntarily pausing before coffin warehouses, and bringing up\n',
 'the rear of every funeral I meet; and especially whenever my hypos get\n',
 'such an upper hand of me, that it requires a strong moral principle to\n',
 'prevent me from deliberately stepping into the street, and methodically\n',
 "knocking people's hats off--then, I account it high time to get to sea\n",
 'as soon as I can. This is my substitute for pistol and ball. With a\n',
 'philosophical flourish Cato throws himself 
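
`read()` returns the whole file as one string and `readlines()` builds a list with every line, so both load the full text into memory. A rough sketch of the line-by-line alternatives follows; the sample file is hypothetical and only borrows the opening words of the text above.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write("CHAPTER 1. Loomings.\n\nCall me Ishmael.\n")

with open(path, "r", encoding="utf-8") as file:
    # readline() reads up to and including the next newline.
    first = file.readline()
    # Iterating over the file object continues from the current position,
    # yielding one line at a time.
    rest = [line.rstrip("\n") for line in file]
    # The position is now at the end of the file, so nothing is left to read.
    leftover = file.read()

print(repr(first))
print(rest)
print(repr(leftover))
```

The empty `leftover` string is the reason a second `read()` or `readlines()` call on the same open file returns nothing: the file has to be reopened (or rewound with `seek(0)`) first.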

## Flat files in NumPy

* `np.loadtxt(filename, delimiter=',', skiprows=1, dtype=str)`

  Loads data from a text file into a NumPy array. `delimiter` sets the field separator, `skiprows` skips leading rows such as a header, and `dtype` sets the element type.

* `np.genfromtxt(filename, delimiter=',', skip_header=1, dtype=None)`

  Similar to `loadtxt`, but more robust to missing data: it can handle empty or non-numeric fields, and `dtype=None` lets it infer a type for each column.
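
A minimal sketch of the two functions, assuming a tiny made-up CSV held in memory with `io.StringIO` instead of a file on disk (the column names and values are invented for the example):

```python
import io

import numpy as np

# Hypothetical CSV contents; the second one has an empty "weight" field.
complete = "height,weight\n1.70,65\n1.65,58\n"
with_gap = "height,weight\n1.70,65\n1.82,\n1.65,58\n"

# loadtxt: skip the header row and parse every field as a float.
arr = np.loadtxt(io.StringIO(complete), delimiter=",", skiprows=1)

# genfromtxt: same idea, but the empty field is filled with nan.
filled = np.genfromtxt(io.StringIO(with_gap), delimiter=",", skip_header=1)

print(arr)
print(filled)
```

Both functions accept a file path in place of the `StringIO` object, which is how they would normally be called on a real flat file.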

---

## Other file types

### ExcelFile

* `xls = pd.ExcelFile('filename.xlsx')`
  Opens an Excel file so it can be inspected before its sheets are loaded into DataFrames.

* `xls.sheet_names`
  Lists the sheets available in the Excel file opened with `ExcelFile`.

* `xls.parse('Sheet1')`
  Loads one specific sheet into a DataFrame.
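
The three calls above could be combined roughly as follows. This is a sketch under assumptions: the workbook, its sheet contents and the file name are invented, and writing or reading `.xlsx` files with pandas needs an Excel engine such as `openpyxl` to be installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-sheet workbook, written first so there is something to read.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"year": [2002, 2004], "value": [1.5, 2.5]}).to_excel(
        writer, sheet_name="Sheet1", index=False
    )
    pd.DataFrame({"country": ["AR", "CL"]}).to_excel(
        writer, sheet_name="Sheet2", index=False
    )

# Open the workbook once, inspect it, then load the sheets that are needed.
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names
    df1 = xls.parse("Sheet1")  # select a sheet by name
    df2 = xls.parse(1)         # or by zero-based position

print(sheet_names)
print(df1)
print(df2)
```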

### SAS/Stata

* `from sas7bdat import SAS7BDAT`
  Reads SAS `.sas7bdat` files through the third-party `sas7bdat` package.

```python
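with SAS7BDAT('filename.sas7bdat') as file:  # placeholder path
    df_sas = file.to_data_frame()  # load the SAS table into a DataFrame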
```
* `pd.read_stata('filename.dta')`
  Loads a Stata `.dta` file into a DataFrame.

### HDF5
```python
import h5py
```
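
A short sketch of how `h5py` is typically used, assuming a made-up file with a single dataset (the group name `measurements` and dataset name `strain` are hypothetical):

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical HDF5 file, written first so there is something to read.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.hdf5")
with h5py.File(path, "w") as f:
    # Intermediate groups in the name are created automatically.
    f.create_dataset("measurements/strain", data=np.arange(5.0))

# An HDF5 file behaves like a nested dictionary of groups and datasets.
with h5py.File(path, "r") as f:
    top_level = list(f.keys())
    strain = f["measurements/strain"][:]  # slicing copies the data into a NumPy array

print(top_level)
print(strain)
```

The slice is taken inside the `with` block because a dataset can no longer be read once its file is closed.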

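### MATLAB

* `scipy.io.loadmat(filename)`
  Reads a MATLAB `.mat` file into a dictionary that maps variable names to NumPy arrays.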

```python
import scipy.io

mat = scipy.io.loadmat(filename)  # filename: path to a .mat file
```
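
To see what the dictionary returned by `loadmat` looks like, here is a small round-trip sketch. The file and the variable name `signal` are invented; `scipy.io.savemat` is used only to produce a `.mat` file to read back.

```python
import os
import tempfile

import numpy as np
import scipy.io

# Hypothetical .mat file, written first so there is something to read.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.mat")
scipy.io.savemat(path, {"signal": np.array([[1.0, 2.0], [3.0, 4.0]])})

mat = scipy.io.loadmat(path)

# Keys wrapped in double underscores hold file metadata, not MATLAB variables.
user_keys = sorted(key for key in mat if not key.startswith("__"))
signal = mat["signal"]

print(type(mat))
print(user_keys)
print(signal)
```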


## Working with Databases

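* Creating an engine and listing its tables
```python
from sqlalchemy import create_engine, inspect

engine = create_engine('db connection')  # placeholder for a real connection string
table_names = inspect(engine).get_table_names()  # engine.table_names() was removed in SQLAlchemy 2.0
```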
* Querying relational dbs in Python
    * With connection
    ```python
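    import pandas as pd
    from sqlalchemy import text

    con = engine.connect()
    rs = con.execute(text("QUERY"))  # SQLAlchemy 2.0 requires text() around raw SQL strings
    df = pd.DataFrame(rs.fetchall())
    df.columns = rs.keys()
    con.close()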
    ```
    * With context manager
    ```python
    from sqlalchemy import text

    with engine.connect() as con:
        rs = con.execute(text("QUERY"))  # text() wraps the raw SQL string
        df = pd.DataFrame(rs.fetchmany(100))  # only the first 100 rows
        df.columns = rs.keys()
    ```
    * Using pandas
    ```python
    df = pd.read_sql_query("QUERY", engine)
    ```
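
The snippets above use placeholders for the connection string and the query. A self-contained sketch of the same workflow, assuming a throwaway SQLite file and an invented `album` table, might look like this:

```python
import os
import tempfile

import pandas as pd
from sqlalchemy import create_engine, inspect, text

# Hypothetical SQLite database in a temporary directory.
tmp_dir = tempfile.mkdtemp()
engine = create_engine("sqlite:///" + os.path.join(tmp_dir, "example.sqlite"))

# engine.begin() opens a transaction and commits it when the block succeeds.
with engine.begin() as con:
    con.execute(text("CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT)"))
    con.execute(text("INSERT INTO album (title) VALUES ('Blue'), ('Kind of Blue')"))

table_names = inspect(engine).get_table_names()

query = text("SELECT id, title FROM album ORDER BY id")
with engine.connect() as con:
    # Manual route: execute, collect the column names, then build the DataFrame.
    rs = con.execute(query)
    columns = list(rs.keys())
    df_manual = pd.DataFrame([tuple(row) for row in rs.fetchall()], columns=columns)

    # pandas route: one call does the same work.
    df_pandas = pd.read_sql_query(query, con)

engine.dispose()  # release the connection pool

print(table_names)
print(df_manual)
print(df_pandas)
```

Both routes produce the same table here; `pd.read_sql_query` is shorter, while the manual route gives control over how many rows are fetched (for example with `fetchmany`).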
