# Overview

Pandas has an options API for configuring and customizing global behavior related to DataFrame display, data behavior, and more. Options have a full "dotted-style", case-insensitive name (for example, display.max_rows). You can get or set options directly as attributes of the top-level options attribute (pd.options).

In [1]:
import numpy as np
import pandas as pd

### The API consists of 5 relevant functions, available directly from the pandas namespace:

#### 1. get_option() / set_option() - get/set the value of a single option.
#### 2. reset_option() - reset one or more options to their default value.
#### 3. describe_option() - print the descriptions of one or more options.
#### 4. option_context() - execute a block of code with a set of options that revert to their previous settings after execution.


### Example - Displaying a maximum number of rows

In [2]:
pd.options.display.max_rows

60

In [6]:
# The option is set as follows
pd.options.display.max_rows = 999

In [7]:
pd.options.display.max_rows

999
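
As a minimal sketch of the point above, attribute access on `pd.options` and the `get_option()` function read the same underlying value. The variable names are illustrative only, and the previous value is saved and restored so the example does not affect later cells.

```python
import pandas as pd

# Save the current value so the example does not leak into later cells.
original = pd.get_option("display.max_rows")

pd.options.display.max_rows = 999            # attribute-style write
via_attribute = pd.options.display.max_rows  # attribute-style read
via_function = pd.get_option("display.max_rows")
print(via_attribute, via_function)

pd.set_option("display.max_rows", original)  # put the old value back
```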

## 1. Available options - describe_option()

In [8]:
# You can get a list of the available options and their descriptions with describe_option().
# When called with no argument, describe_option() prints the descriptions of all available options.
pd.describe_option()

compute.use_bottleneck : bool
    Use the bottleneck library to accelerate if it is installed,
    the default is True
    Valid values: False,True
    [default: True] [currently: True]
compute.use_numba : bool
    Use the numba engine option for select operations if it is installed,
    the default is False
    Valid values: False,True
    [default: False] [currently: False]
compute.use_numexpr : bool
    Use the numexpr library to accelerate computation if it is installed,
    the default is True
    Valid values: False,True
    [default: True] [currently: True]
display.chop_threshold : float or None
    if set to a float value, all float values smaller then the given threshold
    will be displayed as exactly 0 by repr and friends.
    [default: None] [currently: None]
display.colheader_justify : 'left'/'right'
    Controls the justification of column headers. used by DataFrameFormatter.
    [default: right] [currently: right]
display.column_space No description available.
    [defa
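
Since the full listing is long, it can help to describe only a few options. The sketch below assumes that `describe_option()` accepts an option name or pattern and prints to standard output; the `describe` helper is a hypothetical wrapper that captures that text as a string.

```python
import contextlib
import io

import pandas as pd


def describe(pattern):
    """Return the text describe_option() prints for the given pattern."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        pd.describe_option(pattern)
    return buffer.getvalue()


single = describe("display.max_rows")   # exact name: one option
several = describe("display.max_col")   # pattern: every matching option
print(single)
```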

## 2. Getting and setting options - get_option(), set_option() and reset_option()

In [9]:
# As described above, get_option() and set_option() are available from the pandas namespace.
# To change an option, call set_option('option regex', new_value).
pd.get_option("mode.sim_interactive")

False

In [10]:
pd.set_option("mode.sim_interactive", True)

In [11]:
pd.get_option("mode.sim_interactive")

True

In [12]:
# You can use reset_option() to revert a setting to its default value
pd.get_option("display.max_rows")

999

In [13]:
pd.set_option("display.max_rows", 999)

In [14]:
pd.get_option("display.max_rows")

999

In [15]:
pd.reset_option("display.max_rows")

In [16]:
pd.get_option("display.max_rows")

60
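
The cells above can be condensed into one round trip. This is a small sketch, using the default of 60 for display.max_rows shown in the outputs above; the variable names are illustrative.

```python
import pandas as pd

name = "display.max_rows"
pd.set_option(name, 999)
changed = pd.get_option(name)

pd.reset_option(name)          # back to the library default
restored = pd.get_option(name)
print(changed, restored)
```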

In [17]:
# The option_context() context manager is exposed through the top-level API,
# allowing you to execute code with given option values. The previous values are restored when the with block exits.
with pd.option_context("display.max_rows", 10, "display.max_columns", 5):
    print(pd.get_option("display.max_rows"))
    print(pd.get_option("display.max_columns"))

10
5


In [18]:
print(pd.get_option("display.max_rows"))

60


In [19]:
print(pd.get_option("display.max_columns"))

20
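
One practical reason to prefer `option_context()` over a manual set/reset pair is that the previous value comes back even when the block fails. The following sketch simulates a failure with a hypothetical `RuntimeError` to illustrate this.

```python
import pandas as pd

before = pd.get_option("display.max_rows")
try:
    with pd.option_context("display.max_rows", 10):
        inside = pd.get_option("display.max_rows")
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

# The context manager has already restored the earlier value.
after = pd.get_option("display.max_rows")
print(before, inside, after)
```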


## 3. Frequently used options and settings

### 3.1 display.max_rows and display.max_columns

In [20]:
# Sets the maximum number of rows and columns displayed when a frame is pretty-printed.
df = pd.DataFrame(np.random.randn(7, 2))

In [21]:
pd.set_option("display.max_rows", 5)

In [22]:
df

Unnamed: 0,0,1
0,-2.639528,-0.753343
1,0.118827,-2.251023
...,...,...
5,0.740710,-0.068280
6,0.901510,-1.039730


In [23]:
# To reset the setting, use pd.reset_option
pd.reset_option("display.max_rows")

In [24]:
# Once display.max_rows is exceeded, the display.min_rows option determines
# how many rows are shown in the truncated representation.
pd.set_option("display.max_rows", 8)

In [25]:
pd.set_option("display.min_rows", 4)

In [26]:
# Below max_rows -> all rows are shown
df = pd.DataFrame(np.random.randn(7, 2))

In [27]:
df

Unnamed: 0,0,1
0,1.41286,-0.903962
1,-0.435356,0.370168
2,1.276567,0.996586
3,-0.295896,1.132906
4,0.473863,-2.451191
5,1.926982,-1.253088
6,0.681022,0.206809


In [28]:
# Above max_rows -> only min_rows (4) rows are shown
df = pd.DataFrame(np.random.randn(9, 2))

In [29]:
df

Unnamed: 0,0,1
0,0.209959,-1.592044
1,0.827556,-0.693740
...,...,...
7,-0.356346,1.731901
8,0.710769,0.071224


In [30]:
pd.reset_option("display.max_rows")

In [31]:
pd.reset_option("display.min_rows")
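
The interaction between the two options can also be checked programmatically. The sketch below counts the data rows in the text representation; `shown_rows` is a hypothetical helper that relies on the index labels being digits at the start of each data line.

```python
import pandas as pd


def shown_rows(frame):
    """Count data rows in repr(frame): lines that begin with an index digit."""
    return sum(1 for line in repr(frame).splitlines() if line[:1].isdigit())


short = pd.DataFrame({"a": range(7)})
tall = pd.DataFrame({"a": range(9)})

with pd.option_context("display.max_rows", 8, "display.min_rows", 4):
    short_count = shown_rows(short)  # 7 <= max_rows, so nothing is cut
    tall_count = shown_rows(tall)    # 9 > max_rows, so min_rows applies
print(short_count, tall_count)
```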

### 3.2 display.expand_frame_repr (true)

In [32]:
# Allows the representation of a DataFrame to stretch across pages, wrapped over all the columns.
df = pd.DataFrame(np.random.randn(5, 10))

In [33]:
pd.set_option("expand_frame_repr", True)

In [35]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,2.083507,1.369609,-1.121842,1.467991,1.07495,-0.101868,0.120729,-0.868776,0.669934,0.245437
1,-0.677856,-0.686099,-0.936928,0.184865,1.363541,-0.298171,0.054923,0.100713,-0.575025,1.809973
2,0.839928,-1.644121,-0.166748,0.823678,-0.468565,-1.433297,0.675336,-1.910801,0.156736,-0.039951
3,1.587995,0.248818,-0.457177,1.876711,-0.273752,-0.968781,1.438568,-1.992673,1.287518,0.527834
4,0.149784,-0.449601,-1.479222,0.273493,0.541639,-1.313014,0.150838,-0.464667,0.015708,-1.042608


In [36]:
pd.set_option("expand_frame_repr", False)

In [37]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,2.083507,1.369609,-1.121842,1.467991,1.07495,-0.101868,0.120729,-0.868776,0.669934,0.245437
1,-0.677856,-0.686099,-0.936928,0.184865,1.363541,-0.298171,0.054923,0.100713,-0.575025,1.809973
2,0.839928,-1.644121,-0.166748,0.823678,-0.468565,-1.433297,0.675336,-1.910801,0.156736,-0.039951
3,1.587995,0.248818,-0.457177,1.876711,-0.273752,-0.968781,1.438568,-1.992673,1.287518,0.527834
4,0.149784,-0.449601,-1.479222,0.273493,0.541639,-1.313014,0.150838,-0.464667,0.015708,-1.042608


In [38]:
pd.reset_option("expand_frame_repr")
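
The HTML tables above look identical in both modes, so the effect is easier to see in the plain-text representation. This sketch fixes display.width and display.max_columns, which is an assumption made here so that the result does not depend on the terminal, and then compares the number of lines produced.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.default_rng(0).standard_normal((5, 10)))

# Fix the width and column limit so the result does not depend on the terminal.
common = ("display.width", 80, "display.max_columns", 20)
with pd.option_context(*common, "display.expand_frame_repr", True):
    wrapped = repr(df)        # columns are split into several stacked blocks
with pd.option_context(*common, "display.expand_frame_repr", False):
    single_block = repr(df)   # one wide block: header plus five rows

print(len(wrapped.splitlines()), len(single_block.splitlines()))
```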

### 3.3 display.large_repr

In [40]:
# Displays a DataFrame that exceeds max_columns or max_rows as a truncated frame or as a summary.
df = pd.DataFrame(np.random.randn(10, 10))

In [41]:
pd.set_option("display.max_rows", 5)

In [42]:
pd.set_option("large_repr", "truncate")

In [43]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,-0.756200,-1.583374,-0.937080,-1.737756,-0.126661,1.204784,1.004779,0.694934,-0.713546,-0.460600
1,-0.268712,0.729537,0.462077,-0.419470,0.589128,-0.620988,1.189341,-0.394800,1.121615,-0.956635
...,...,...,...,...,...,...,...,...,...,...
8,-0.995483,0.059397,0.791699,-0.700620,-0.660777,0.403856,0.262984,3.634474,-0.596965,1.735728
9,0.455736,-0.004107,-1.049484,0.397669,0.833004,1.196970,-0.495508,-1.449819,-0.583625,0.150141


In [44]:
pd.set_option("large_repr", "info")

In [45]:
df

In [46]:
pd.reset_option("large_repr")

In [47]:
pd.reset_option("display.max_rows")
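
To compare the two modes side by side, the text representation can be captured under each setting. This is a sketch with a frame of zeros rather than random data; with "info", the representation is expected to match what `df.info()` prints.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((10, 10)))

with pd.option_context("display.max_rows", 5, "display.large_repr", "truncate"):
    truncated = repr(df)   # head and tail rows with an ellipsis row between
with pd.option_context("display.max_rows", 5, "display.large_repr", "info"):
    summary = repr(df)     # an info()-style summary instead of the values

print(summary)
```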

### 3.4 display.max_colwidth - Maximum number of characters shown per cell

In [48]:
# Sets the maximum width of columns. Cells of this length or longer will be truncated with an ellipsis.
df = pd.DataFrame(
    np.array(
        [
            ["foo", "bar", "bim", "uncomfortably long string"],
            ["horse", "cow", "banana", "apple"],
        ]
    )
)

In [49]:
pd.set_option("max_colwidth", 40)

In [50]:
df

Unnamed: 0,0,1,2,3
0,foo,bar,bim,uncomfortably long string
1,horse,cow,banana,apple


In [51]:
pd.set_option("max_colwidth", 6)

In [52]:
df

Unnamed: 0,0,1,2,3
0,foo,bar,bim,un...
1,horse,cow,ba...,apple
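
The same truncation can be observed without changing the global setting. The sketch below uses `option_context()` and `repr()`; the column name `text` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"text": ["foo", "uncomfortably long string"]})

# repr() is used on purpose: DataFrame.to_string() has its own max_colwidth argument.
with pd.option_context("display.max_colwidth", 40):
    wide = repr(df)
with pd.option_context("display.max_colwidth", 6):
    narrow = repr(df)

print(narrow)
```

Because the option is set inside a `with` block, there is no need to reset max_colwidth afterwards.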


### 3.5 display.max_info_columns

In [54]:
# Sets a threshold for the number of columns displayed when calling info().
df = pd.DataFrame(np.random.randn(10, 10))

In [55]:
pd.set_option("max_info_columns", 11)

In [57]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       10 non-null     float64
 1   1       10 non-null     float64
 2   2       10 non-null     float64
 3   3       10 non-null     float64
 4   4       10 non-null     float64
 5   5       10 non-null     float64
 6   6       10 non-null     float64
 7   7       10 non-null     float64
 8   8       10 non-null     float64
 9   9       10 non-null     float64
dtypes: float64(10)
memory usage: 928.0 bytes


In [58]:
pd.set_option("max_info_columns", 5)

In [59]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Columns: 10 entries, 0 to 9
dtypes: float64(10)
memory usage: 928.0 bytes


In [60]:
pd.reset_option("max_info_columns")
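
The switch between the per-column listing and the one-line summary can be captured as text by passing a buffer to `info()`. In this sketch, `info_text` is a hypothetical helper, not part of pandas.

```python
import io

import numpy as np
import pandas as pd


def info_text(frame):
    """Return what frame.info() would print, as a string."""
    buffer = io.StringIO()
    frame.info(buf=buffer)
    return buffer.getvalue()


df = pd.DataFrame(np.zeros((10, 10)))

with pd.option_context("display.max_info_columns", 11):
    per_column = info_text(df)   # 10 columns <= 11: one line per column
with pd.option_context("display.max_info_columns", 5):
    summarized = info_text(df)   # 10 columns > 5: a single summary line

print(summarized)
```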

### 3.6 display.max_info_rows

In [61]:
# info() will usually show null counts for each column. For a large DataFrame, this can be quite slow.
# max_info_rows and max_info_columns limit this null check to the specified rows and columns,
# respectively. The info() keyword argument show_counts=True (null_counts=True in older pandas releases) overrides this.
df = pd.DataFrame(np.random.choice([0, 1, np.nan], size=(10, 10)))

In [62]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,
1,,0.0,,,0.0,,,0.0,1.0,0.0
2,0.0,0.0,,,,,,0.0,1.0,1.0
3,0.0,,,1.0,,0.0,,1.0,0.0,
4,0.0,1.0,,,1.0,0.0,,0.0,1.0,
5,0.0,0.0,0.0,,1.0,0.0,,,0.0,1.0
6,0.0,1.0,0.0,0.0,,0.0,0.0,1.0,0.0,0.0
7,,,,0.0,1.0,0.0,1.0,0.0,1.0,1.0
8,0.0,1.0,,,1.0,,0.0,,,1.0
9,0.0,1.0,,0.0,,,1.0,1.0,0.0,


In [63]:
pd.set_option("max_info_rows", 11)

In [65]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       8 non-null      float64
 1   1       8 non-null      float64
 2   2       3 non-null      float64
 3   3       5 non-null      float64
 4   4       6 non-null      float64
 5   5       6 non-null      float64
 6   6       5 non-null      float64
 7   7       8 non-null      float64
 8   8       9 non-null      float64
 9   9       6 non-null      float64
dtypes: float64(10)
memory usage: 928.0 bytes


In [66]:
pd.set_option("max_info_rows", 5)

In [67]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
 #   Column  Dtype  
---  ------  -----  
 0   0       float64
 1   1       float64
 2   2       float64
 3   3       float64
 4   4       float64
 5   5       float64
 6   6       float64
 7   7       float64
 8   8       float64
 9   9       float64
dtypes: float64(10)
memory usage: 928.0 bytes


In [68]:
pd.reset_option("max_info_rows")
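
A short sketch of the override mentioned above: it assumes a pandas release where `info()` accepts `show_counts` (older releases used `null_counts`), and `info_text` is again a hypothetical helper that captures the printed text.

```python
import io

import numpy as np
import pandas as pd


def info_text(frame, **kwargs):
    """Return what frame.info(**kwargs) would print, as a string."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


df = pd.DataFrame({"a": [0.0, np.nan] * 5, "b": [1.0] * 10})

with pd.option_context("display.max_info_rows", 5):
    without_counts = info_text(df)                   # 10 rows > 5: counts skipped
    forced_counts = info_text(df, show_counts=True)  # explicit request wins
with pd.option_context("display.max_info_rows", 11):
    with_counts = info_text(df)                      # 10 rows <= 11: counts shown

print(without_counts)
```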

### 3.7 display.precision - decimal places

In [69]:
# Sets the output display precision in terms of decimal places.
df = pd.DataFrame(np.random.randn(5, 5))

In [70]:
pd.set_option("display.precision", 7)

In [72]:
pd.set_option("display.precision", 3)

In [73]:
df

Unnamed: 0,0,1,2,3,4
0,0.161,0.106,0.557,-0.152,-0.559
1,-0.806,-0.672,-0.643,2.225,0.037
2,1.214,-0.623,0.496,-2.109,0.176
3,-1.596,0.687,0.728,-1.663,-0.332
4,-0.234,1.274,0.152,0.359,-1.542
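
With random data it is hard to tell what changed, so the sketch below uses known fractions instead. It suggests that display.precision only affects how values are printed, not how they are stored.

```python
import pandas as pd

df = pd.DataFrame({"x": [1 / 3, 2 / 3]})

with pd.option_context("display.precision", 3):
    short = repr(df)
with pd.option_context("display.precision", 7):
    longer = repr(df)

print(short)
print(longer)

# Only the display changes; the stored value keeps full float precision.
unchanged = df.loc[0, "x"] == 1 / 3
```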


### 3.8 display.chop_threshold

In [74]:
# Sets the threshold below which values are rounded to zero when a Series or DataFrame is displayed.
# This setting does not change the precision at which the number is stored.
df = pd.DataFrame(np.random.randn(6, 6))

In [75]:
pd.set_option("chop_threshold", 0)

In [76]:
df

Unnamed: 0,0,1,2,3,4,5
0,-0.468,1.831,0.131,-0.613,1.164,-0.167
1,0.817,0.498,-0.072,1.14,1.036,1.512
2,-0.841,0.21,0.092,0.035,-0.721,-0.834
3,1.029,0.447,-0.089,0.397,-0.042,0.292
4,-1.385,0.3,1.337,1.13,0.865,-0.245
5,-0.552,0.03,-0.949,0.587,0.419,-0.156


In [79]:
# Any value whose absolute value is smaller than 0.5 is displayed as zero in the table
pd.set_option("chop_threshold", 0.5)

In [80]:
df

Unnamed: 0,0,1,2,3,4,5
0,0.0,1.831,0.0,-0.613,1.164,0.0
1,0.817,0.0,0.0,1.14,1.036,1.512
2,-0.841,0.0,0.0,0.0,-0.721,-0.834
3,1.029,0.0,0.0,0.0,0.0,0.0
4,-1.385,0.0,1.337,1.13,0.865,0.0
5,-0.552,0.0,-0.949,0.587,0.0,0.0


In [81]:
pd.reset_option("chop_threshold")
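
The following sketch uses a few hand-picked values to show that the threshold applies to the magnitude of each number and that the data itself is left alone. The column name `x` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"x": [0.2, -0.4, 0.75, -1.5]})

with pd.option_context("display.chop_threshold", 0.5):
    chopped = repr(df)   # 0.2 and -0.4 are printed as zero

print(chopped)

# The underlying data is untouched; only the printed form is zeroed.
stored = df["x"].tolist()
```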

### 3.9 display.colheader_justify - Column header alignment (left or right)

In [89]:
# Controls the justification of the column headers. The options are 'right' and 'left'.
df = pd.DataFrame(
    np.array([np.random.randn(6), np.random.randint(1, 9, 6) * 0.1, np.zeros(6)]).T,
    columns=["A", "B", "C"],
    dtype="float",
)

In [90]:
pd.set_option("colheader_justify", "right")

In [91]:
df

Unnamed: 0,A,B,C
0,0.759,0.3,0.0
1,0.581,0.4,0.0
2,0.789,0.5,0.0
3,-0.109,0.7,0.0
4,0.271,0.3,0.0
5,-2.023,0.4,0.0


In [92]:
pd.set_option("colheader_justify", "left")

In [93]:
df

Unnamed: 0,A,B,C
0,0.759,0.3,0.0
1,0.581,0.4,0.0
2,0.789,0.5,0.0
3,-0.109,0.7,0.0
4,0.271,0.3,0.0
5,-2.023,0.4,0.0
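
The alignment is not visible in the flattened tables above, so this sketch inspects the header line of the text representation directly. The frame and its values are illustrative; the comparison looks at where the label `A` starts in each mode.

```python
import pandas as pd

df = pd.DataFrame({"A": [100000, 200000]})

with pd.option_context("display.colheader_justify", "right"):
    right_header = repr(df).splitlines()[0]
with pd.option_context("display.colheader_justify", "left"):
    left_header = repr(df).splitlines()[0]

# Show where the column label starts on the header line in each mode.
print(right_header.index("A"), left_header.index("A"))
```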
