<a href="https://colab.research.google.com/github/Cbautista80/20230812Claseseminario2/blob/master/Cuadernos/4_MediumData_Polars_y_Python2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ***Medium Data Processing in Python***

## ***Universidad Externado de Colombia***

>## ***Master's in Business Intelligence***
![Imágen1](https://www.uexternado.edu.co/wp-content/uploads/2020/07/logo-uec.png)

jazaineam@unal.edu.co


>## ***Big Data.***
>## ***Instructor: Antonino Zainea Maya.***

![](https://w7.pngwing.com/pngs/838/716/png-transparent-we-bare-bears-characters-polar-bear-giant-panda-grizzly-bear-mammal-polar-bear-white-animals-cat-like-mammal.png)

The term "Medium Data" is not as commonly defined or used in the information technology industry as "Big Data" or "Small Data". It can be understood as sitting between the two, both in data volume and in the complexity of handling and analysis.

### The Medium Data concept

- **Size and scale:** Medium-sized data is larger and more complex than small datasets (such as a spreadsheet of customer records for a small business), but it does not reach the volume or variety of Big Data (such as the data generated by global social networks or large-scale IoT sensors).

- **Characteristics:** This data can span several sources and types, yet it remains manageable with standard data analysis tools and technologies. It may demand some processing power and storage capacity, but not to the extent that Big Data does.

- **Scalability:** A key challenge is scalability. As an organization grows, its data may approach the Big Data threshold, which calls for re-evaluating its data management tools and strategies.



### What is Pandas?

- **Open-source library:** Written in Python, it provides high-performance data structures and analysis tools.
- **Data structures:** It offers two main structures, Series (one-dimensional) and DataFrame (two-dimensional).
- **Data manipulation and analysis:** Well suited to cleaning, transforming, aggregating, and visualizing data.
- **Reading and writing data:** Supports several formats such as CSV, Excel, and SQL, among others.
- **Use of NumPy:** Traditionally, it has relied on NumPy for low-level operations and array manipulation.

Pandas 2.0, released on April 3, 2023, is the result of three years of development. It brings better integration with extension arrays and support for PyArrow-backed DataFrames. It also introduces datetime resolutions other than nanoseconds and makes several API changes by enforcing earlier deprecations. Code that runs without warnings on version 1.5.3 should be compatible with 2.0.

### Changes and new features in Pandas 2.0

#### 1. **Apache Arrow as an alternative backend to NumPy:**

PyArrow support is a significant step in Pandas 2.0, allowing more memory-efficient processing of large datasets. Pandas has traditionally been built on NumPy, which is effective but can be memory-inefficient for large datasets. NumPy remains the default backend in 2.0, and PyArrow is opt-in. PyArrow, built on the Apache Arrow columnar data format, provides structures optimized for large tabular data that are designed to be fast and to minimize memory use.

With PyArrow, Pandas users can often expect a smaller memory footprint and better overall performance, which makes Pandas more viable for datasets that previously would have required moving to tools such as Spark or Dask. PyArrow also eases interoperability with other data processing systems and storage formats, contributing to a more integrated and efficient data ecosystem.

#### 2. Nullable data types are easier to use

Pandas 2.0 improves the handling of missing values through nullable data types. NumPy dtypes such as integers cannot represent missing values, so an integer column containing missing values was automatically, and often undesirably, converted to float.

Nullable dtypes were introduced around Pandas 1.0, but adopting them took extra effort from the user. In 2.0 this is much simpler. When importing data with `read_csv`, you can pass `dtype_backend="numpy_nullable"` (the released 2.0 API uses this argument rather than `use_nullable_dtypes=True`) so that columns are automatically given dtypes that allow missing values. This avoids the unwanted conversions and makes working with missing data more direct and less error-prone.
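
As a rough illustration of the point above, the following sketch reads the same small CSV twice, once with the default NumPy dtypes and once with nullable dtypes. The CSV text and the column names `id` and `score` are made up for the example, and it assumes pandas 2.0 or later, where `dtype_backend` is available.

```python
import io

import pandas as pd

# Hypothetical CSV with one missing id and one missing score.
csv_text = "id,score\n1,0.5\n,0.7\n3,\n"

# Default NumPy dtypes: the missing id forces the whole column to float64.
df_default = pd.read_csv(io.StringIO(csv_text))

# Nullable dtypes (assumes pandas >= 2.0): id stays an integer column,
# and the gap is stored as <NA>.
df_nullable = pd.read_csv(io.StringIO(csv_text), dtype_backend="numpy_nullable")

print(df_default.dtypes)
print(df_nullable.dtypes)
```

Comparing the two printed dtype listings shows which columns were kept as integers.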



#### 3. Copy-on-Write performance improvements

Copy-on-Write is a memory optimization strategy in Pandas 2.0 that improves performance and reduces memory use with large datasets. In 2.0 it is opt-in and is enabled with `pd.options.mode.copy_on_write = True`. Like lazy operations in Spark, it defers work until it is actually needed: when you derive a new object from a Pandas object such as a DataFrame, the result initially references the original data, and a real copy is made only when one of them is modified. This avoids redundant copies and lowers memory use.
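
A minimal sketch of this behavior, assuming a pandas version where the `mode.copy_on_write` option exists (it is an opt-in setting in the 2.x series). The DataFrame and its column names are invented for the example.

```python
import pandas as pd

# Opt in to Copy-on-Write (an option in pandas 2.x).
pd.options.mode.copy_on_write = True

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

col = df["a"]       # shares memory with df; nothing is copied yet
col.iloc[0] = 100   # the first write triggers the actual copy

print(df["a"].tolist())   # the parent DataFrame is left untouched
print(col.tolist())       # only the derived Series carries the change
```

With Copy-on-Write enabled, a write to a derived object never propagates back to its parent, which also removes the ambiguity behind `SettingWithCopyWarning`.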

#### 4. Index supports NumPy numeric types
In Pandas 2.0, indexes support a wider range of NumPy numeric types, including lower-bit types such as int8, int16, and int32, among others. Previously, only int64, uint64, and float64 were supported. This makes it possible to create smaller indexes, for example 32-bit ones, in situations that used to produce 64-bit indexes, which improves memory efficiency.
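
A short sketch of the difference, assuming pandas 2.0 or later (earlier versions would upcast this index to int64). The values are arbitrary.

```python
import pandas as pd

# Request an 8-bit integer index explicitly.
idx = pd.Index([1, 2, 3], dtype="int8")

print(idx.dtype)    # the requested dtype is kept in pandas >= 2.0
print(idx.nbytes)   # one byte per element for int8
```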

#### 5. Non-nanosecond resolution in timestamps
Timestamps are no longer limited to nanosecond resolution. Resolutions of seconds, milliseconds, and microseconds are now supported, which allows much wider time ranges, up to approximately +/- 2.9e11 years. This is especially useful for time series analyses that span long periods, since it lifts the earlier date range restrictions.
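
The sketch below builds a Series at second resolution holding dates that fall outside the nanosecond range (roughly the years 1677 to 2262). It assumes pandas 2.0 or later; the two dates are arbitrary examples.

```python
import numpy as np
import pandas as pd

# Dates outside the nanosecond-representable range, stored at second resolution.
dates = np.array(["1000-01-01", "3000-01-01"], dtype="datetime64[s]")
ser = pd.Series(dates)

print(ser.dtype)             # the second resolution is preserved
print(ser.dt.year.tolist())  # datetime accessors work on non-nanosecond data
```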

#### 6. Consistent parsing format for dates and times

Date and time parsing with `to_datetime()` now uses a single, consistent format inferred from the first non-NA value. Previously, the function guessed the format of each element independently, which could produce inconsistent results within the same column. Users can also pass an explicit format, which takes precedence over the inferred one.

Before:
```python
import pandas as pd

ser = pd.Series(['13-01-2000', '12-01-2000'])
pd.to_datetime(ser)
# Out[2]:
# 0   2000-01-13
# 1   2000-12-01
# dtype: datetime64[ns]
```

Now:

```python
import pandas as pd

ser = pd.Series(['13-01-2000', '12-01-2000'])
pd.to_datetime(ser)
# Out[43]:
# 0   2000-01-13
# 1   2000-01-12
# dtype: datetime64[ns]
```
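
To avoid relying on format inference at all, the format can be stated explicitly. This sketch reuses the same two strings and assumes they are day-month-year dates.

```python
import pandas as pd

ser = pd.Series(["13-01-2000", "12-01-2000"])

# An explicit format removes any ambiguity between day and month.
parsed = pd.to_datetime(ser, format="%d-%m-%Y")

print(parsed.dt.day.tolist())
print(parsed.dt.month.tolist())
```

Passing `format` behaves the same way across pandas versions, so it is a reasonable default whenever the layout of the input is known.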

Additional changes are listed at https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#backwards-incompatible-api-changes

In [1]:
import pandas as pd
pd.__version__

'1.5.3'

In [4]:
!pip install pandas==2.1.4

Collecting pandas==2.1.4
  Downloading pandas-2.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
Collecting tzdata>=2022.1 (from pandas==2.1.4)
  Downloading tzdata-2023.4-py2.py3-none-any.whl (346 kB)
Installing collected packages: tzdata, pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 1.5.3
    Uninstalling pandas-1.5.3:
      Successfully uninstalled pandas-1.5.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which

## Polars

See the [cheat sheet](https://franzdiebold.github.io/polars-cheat-sheet/Polars_cheat_sheet.pdf) for more.

Polars combines the flexibility and ease of use of Python with the speed and scalability of Rust. It is fast thanks to its core written in Rust, a memory-efficient language with performance comparable to C or C++. Polars can use all CPU cores in parallel and handles large datasets. Its intuitive API is easy to pick up for anyone familiar with libraries such as Pandas. It also uses Apache Arrow to run vectorized queries and columnar data storage for fast in-memory processing. These features make it an attractive library for data processing.

In [1]:
# install polars
!pip install polars



In [2]:
import polars as pl
pl.__version__

'0.20.4'

In [16]:
!pip install polars==0.20.4

Collecting polars==0.20.4
  Downloading polars-0.20.4-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28.6 MB)
Installing collected packages: polars
  Attempting uninstall: polars
    Found existing installation: polars 0.17.3
    Uninstalling polars-0.17.3:
      Successfully uninstalled polars-0.17.3
Successfully installed polars-0.20.4


If Polars imports without errors, the basic version of Polars has been installed successfully. This lightweight installation lets you get started without additional dependencies. To use the more advanced features of Polars, which include interacting with the Python ecosystem and external data sources, you need to install Polars with the corresponding optional extras. For example, to convert Polars DataFrames to pandas DataFrames and NumPy arrays, install Polars with the extras that provide those features.

In [3]:
pip install "polars[numpy, pandas]"



This command installs the Polars core together with what is needed to convert Polars DataFrames to pandas and NumPy objects. The full list of optional dependencies is available in the Polars documentation. Alternatively, to get every feature, install Polars with all optional dependencies using the command:
```
pip install "polars[all]"
```

#### Creating and reading DataFrames

In [7]:
df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": [0.3, 0.7, 0.1, 0.9, 0.6],
        "groups": ["A", "A", "B", "C", "B"],
    }
)

In [8]:
df

| nrs | names | random | groups |
|-----|-------|--------|--------|
| i64 | str | f64 | str |
| 1 | "foo" | 0.3 | "A" |
| 2 | "ham" | 0.7 | "A" |
| 3 | "spam" | 0.1 | "B" |
| null | "egg" | 0.9 | "C" |
| 5 | null | 0.6 | "B" |


In [9]:
# read a CSV file
df = pl.read_csv("https://j.mp/iriscsv", has_header=True)
df.head()

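| sepal_length | sepal_width | petal_length | petal_width | species |
|--------------|-------------|--------------|-------------|---------|
| f64 | f64 | f64 | f64 | str |
| 5.1 | 3.5 | 1.4 | 0.2 | "setosa" |
| 4.9 | 3.0 | 1.4 | 0.2 | "setosa" |
| 4.7 | 3.2 | 1.3 | 0.2 | "setosa" |
| 4.6 | 3.1 | 1.5 | 0.2 | "setosa" |
| 5.0 | 3.6 | 1.4 | 0.2 | "setosa" |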


In [4]:
%%capture
pip install sodapy

In [5]:
import pandas as pd
from sodapy import Socrata


client = Socrata("www.datos.gov.co", None)
results = client.get("jbjy-vk9h", limit=100000)



In [6]:
results

[{'nombre_entidad': 'SECRETARIA DE INFRAESTRUCTURA - GOBERNACION DEL VALLE',
  'nit_entidad': '8903990295',
  'departamento': 'Valle del Cauca',
  'ciudad': 'Cali',
  'localizaci_n': 'Colombia,  Valle del Cauca ,  Cali',
  'orden': 'Territorial',
  'sector': 'Servicio Público',
  'rama': 'Ejecutivo',
  'entidad_centralizada': 'Centralizada',
  'proceso_de_compra': 'CO1.BDOS.2164888',
  'id_contrato': 'CO1.PCCNTR.2754764',
  'referencia_del_contrato': '1.310.02-59.2-0450',
  'estado_contrato': 'En ejecución',
  'codigo_de_categoria_principal': 'V1.72141003',
  'descripcion_del_proceso': 'Prestación de Servicios Profesionales para realizar el apoyo al proyecto Desarrollo de la Infraestructura del Transporte a cargo del departamento del Valle del Cauca.',
  'tipo_de_contrato': 'Prestación de servicios',
  'modalidad_de_contratacion': 'Contratación directa',
  'justificacion_modalidad_de': 'Servicios profesionales y apoyo a la gestión',
  'fecha_de_firma': '2021-08-12T00:00:00.000',
  'fec

In [7]:
datos_pagina = pl.DataFrame(results,infer_schema_length=0)
datos_pagina.head()

nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,struct[1],str,str,str,str,str,str,str,str,str,str,str,str,str,str,str
"""SECRETARIA DE …","""8903990295""","""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…","""2021-08-12T00:…","""2021-08-17T00:…","""2021-12-31T00:…","""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""","""16500000""","""0""","""16500000""","""0""","""16500000""","""0""","""0""","""0""","""Válido""","""2020003760195""","""2023""","""1536500000""","""0""","""No""","""0""","""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""","""0""","""0""","""0""","""16500000""","""0""","""0""","""709412027""","""710837337""","""Prestación de …"
"""INSTITUCIÓN UN…","""890980134""","""Antioquia""","""Medellín""","""Colombia, Ant…","""Territorial""","""Educación Naci…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.30981…","""CO1.PCCNTR.387…","""CMA-CD-9797-94…","""En ejecución""","""V1.80111600""","""El Contratista…","""Prestación de …","""Contratación d…","""Servicios prof…","""2022-08-03T00:…","""2022-08-04T00:…","""2023-01-01T00:…","""No Definido""","""Cédula de Ciud…","""8356403""","""GUSTAVO ADOLFO…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…","""10961550""","""0""","""10961550""","""0""","""10961550""","""0""","""0""","""0""","""No Válido""","""No Definido""","""No D""","""4262184450""","""0""","""No""","""0""","""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3100647&isFromPublicArea=True&isModal=true&asPopupView=true""}","""GUSTAVO ADOLFO…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""","""0""","""0""","""0""","""10961550""","""0""","""0""","""704629146""","""714239985""","""El Contratista…"
"""DISTRITO ESPEC…","""890102018""","""Atlántico""","""Barranquilla""","""Colombia, Atl…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16947…","""CO1.PCCNTR.216…","""CD-57-2021-045…","""En ejecución""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…","""2021-01-28T00:…","""2021-02-02T00:…","""2022-01-01T00:…","""No Definido""","""Cédula de Ciud…","""72142696""","""ANGEL VICENTE …","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""","""20234250""","""0""","""20234250""","""0""","""20234250""","""0""","""0""","""0""","""Válido""","""2021080010005""","""2021""","""2002311000""","""0""","""No""","""0""","""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1690930&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANGEL VICENTE …","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""","""0""","""0""","""0""","""20234250""","""0""","""0""","""702442096""","""709472161""","""PRESTACIÓN DE …"
"""OPERADORA DIST…","""901526664""","""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Territorial""","""Transporte""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.47437…","""CO1.PCCNTR.533…","""CO1.PCCNTR.533…","""En ejecución""","""V1.82151500""","""PRESTACIÓN DE …","""Prestación de …","""Contratación r…","""Regla aplicabl…","""2023-10-04T00:…","""2023-10-26T00:…","""2024-01-12T00:…","""A convenir""","""Cédula de Ciud…","""53065957""","""Xiomara Meliss…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…","""30500000""","""0""","""0""","""30500000""","""0""","""0""","""0""","""30500000""","""No Válido""","""No Definido""","""No D""","""30500000""","""0""","""No""","""0""","""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4829726&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Xiomara Meliss…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""","""0""","""0""","""0""","""30500000""","""0""","""0""","""715753901""","""708941216""","""PRESTACIÓN DE …"
"""COMANDO FAC""","""899999102""","""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Nacional""","""defensa""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.34296…","""CO1.PCCNTR.433…","""157-00-A-COFAC…","""Modificado""","""V1.78181800""","""INSPECCIONES M…","""Prestación de …","""Selección Abre…","""Defensa y segu…","""2022-12-23T00:…","""2022-12-27T00:…","""2023-12-21T00:…","""DAP - Entregad…","""No Definido""","""900346811""","""AGG MRO""","""No""","""Si""","""No""","""Si""","""No""","""No""","""No""","""Distribuido""","""Inversión""","""3708000000""","""0""","""1236000000""","""2472000000""","""1236000000""","""0""","""0""","""2472000000""","""No Válido""","""No Definido""","""No D""","""3708000000""","""0""","""No""","""20""","""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3524611&isFromPublicArea=True&isModal=true&asPopupView=true""}","""YOBANY ANDRES …","""CO""","""AVENIDA EL DOR…","""Sin Descripcio…","""Sin Descripcio…","""No Definido""","""3708000000""","""0""","""0""","""0""","""0""","""0""","""700409022""","""703192971""","""INSPECCIONES M…"


In [8]:
type(datos_pagina)

polars.dataframe.frame.DataFrame

#### Changing data types

In [9]:
formato_fecha = "%Y-%m-%dT%H:%M:%S.%f"

# Columns that hold ISO-like timestamps stored as strings.
columnas_fecha = [
    "fecha_de_firma",
    "fecha_de_inicio_del_contrato",
    "fecha_de_fin_del_contrato",
]

# Columns that hold numbers stored as strings.
# Note: Float32 keeps only about 7 significant digits, so identifiers such as
# nit_entidad and large amounts are rounded; Float64 or Int64 would preserve them.
columnas_float = [
    "nit_entidad",
    "valor_del_contrato",
    "valor_de_pago_adelantado",
    "valor_facturado",
    "valor_pendiente_de_pago",
    "valor_pagado",
    "valor_amortizado",
    "valor_pendiente_de",
    "valor_pendiente_de_ejecucion",
    "saldo_cdp",
    "saldo_vigencia",
    "dias_adicionados",
    "presupuesto_general_de_la_nacion_pgn",
    "sistema_general_de_participaciones",
    "sistema_general_de_regal_as",
    "recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_",
    "recursos_de_credito",
    "recursos_propios",
]

# A single with_columns call applies every cast in one pass.
datos_pagina = datos_pagina.with_columns(
    pl.col(columnas_float).cast(pl.Float32),
    pl.col(columnas_fecha).str.to_datetime(formato_fecha, strict=False),
)






In [None]:
datos_pagina.dtypes

[String,
 Float32,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 Datetime(time_unit='ns', time_zone=None),
 Datetime(time_unit='ns', time_zone=None),
 Datetime(time_unit='ns', time_zone=None),
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 String,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 String,
 String,
 String,
 Float32,
 Float32,
 String,
 Float32,
 String,
 String,
 Struct({'url': String}),
 String,
 String,
 String,
 String,
 String,
 String,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 Float32,
 String,
 String,
 String]

#### Writing to Parquet

In [10]:
datos_pagina.write_parquet("c:\\Users\\nib1l\\Downloads\\datos.parquet")

#### Reading from Parquet

In [11]:
datos_pagina = pl.read_parquet("c:\\Users\\nib1l\\Downloads\\datos.parquet")
datos_pagina.head()

nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
str,f32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,datetime[ns],datetime[ns],datetime[ns],str,str,str,str,str,str,str,str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,f32,f32,str,str,str,f32,f32,str,f32,str,str,struct[1],str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,str,str,str
"""SECRETARIA DE …",8904000000.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-12 00:00:00,2021-08-17 00:00:00,2021-12-31 00:00:00,"""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",16500000.0,0.0,16500000.0,0.0,16500000.0,0.0,0.0,0.0,"""Válido""","""2020003760195""","""2023""",1536500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,16500000.0,0.0,0.0,"""709412027""","""710837337""","""Prestación de …"
"""INSTITUCIÓN UN…",890980160.0,"""Antioquia""","""Medellín""","""Colombia, Ant…","""Territorial""","""Educación Naci…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.30981…","""CO1.PCCNTR.387…","""CMA-CD-9797-94…","""En ejecución""","""V1.80111600""","""El Contratista…","""Prestación de …","""Contratación d…","""Servicios prof…",2022-08-03 00:00:00,2022-08-04 00:00:00,2023-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""8356403""","""GUSTAVO ADOLFO…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",10961550.0,0.0,10961550.0,0.0,10961550.0,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",4262200000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3100647&isFromPublicArea=True&isModal=true&asPopupView=true""}","""GUSTAVO ADOLFO…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,10961550.0,0.0,0.0,"""704629146""","""714239985""","""El Contratista…"
"""DISTRITO ESPEC…",890102016.0,"""Atlántico""","""Barranquilla""","""Colombia, Atl…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16947…","""CO1.PCCNTR.216…","""CD-57-2021-045…","""En ejecución""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-01-28 00:00:00,2021-02-02 00:00:00,2022-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""72142696""","""ANGEL VICENTE …","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",20234250.0,0.0,20234250.0,0.0,20234250.0,0.0,0.0,0.0,"""Válido""","""2021080010005""","""2021""",2002300000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1690930&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANGEL VICENTE …","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,20234250.0,0.0,0.0,"""702442096""","""709472161""","""PRESTACIÓN DE …"
"""OPERADORA DIST…",901526656.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Territorial""","""Transporte""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.47437…","""CO1.PCCNTR.533…","""CO1.PCCNTR.533…","""En ejecución""","""V1.82151500""","""PRESTACIÓN DE …","""Prestación de …","""Contratación r…","""Regla aplicabl…",2023-10-04 00:00:00,2023-10-26 00:00:00,2024-01-12 00:00:00,"""A convenir""","""Cédula de Ciud…","""53065957""","""Xiomara Meliss…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",30500000.0,0.0,0.0,30500000.0,0.0,0.0,0.0,30500000.0,"""No Válido""","""No Definido""","""No D""",30500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4829726&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Xiomara Meliss…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,30500000.0,0.0,0.0,"""715753901""","""708941216""","""PRESTACIÓN DE …"
"""COMANDO FAC""",899999104.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Nacional""","""defensa""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.34296…","""CO1.PCCNTR.433…","""157-00-A-COFAC…","""Modificado""","""V1.78181800""","""INSPECCIONES M…","""Prestación de …","""Selección Abre…","""Defensa y segu…",2022-12-23 00:00:00,2022-12-27 00:00:00,2023-12-21 00:00:00,"""DAP - Entregad…","""No Definido""","""900346811""","""AGG MRO""","""No""","""Si""","""No""","""Si""","""No""","""No""","""No""","""Distribuido""","""Inversión""",3708000000.0,0.0,1236000000.0,2472000000.0,1236000000.0,0.0,0.0,2472000000.0,"""No Válido""","""No Definido""","""No D""",3708000000.0,0.0,"""No""",20.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3524611&isFromPublicArea=True&isModal=true&asPopupView=true""}","""YOBANY ANDRES …","""CO""","""AVENIDA EL DOR…","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",3708000000.0,0.0,0.0,0.0,0.0,0.0,"""700409022""","""703192971""","""INSPECCIONES M…"


#### Polars expressions
Polars expressions can be chained, which improves code readability. In this example, we filter rows and then group the results.

In [None]:
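# Keep rows whose 'orden' sorts alphabetically before 'Territorial' (a string comparison, e.g. 'Nacional'),
# then group by 'sector' and sum 'valor_del_contrato'.
# group_by replaces the deprecated groupby spelling in recent Polars versions.
datos_pagina.filter(pl.col("orden") < 'Territorial').group_by("sector").agg(pl.col('valor_del_contrato').sum())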




| sector | valor_del_contrato |
|--------|--------------------|
| str | f32 |
| "defensa" | 1.3244e12 |
| "Cultura" | 8.3392e10 |
| "Educación Naci…" | 2.2017e11 |
| "Minas y Energí…" | 1.2938e11 |
| "Ciencia Tecnol…" | 1.6644e10 |
| "No aplica/No p…" | 3.9165e11 |
| "Inteligencia E…" | 2.7438e9 |
| "agricultura" | 1.3126e11 |
| "Relaciones Ext…" | 2.2643e10 |
| "Vivienda, Ciud…" | 1.7395e11 |


Some of the functions available on expressions are:

| Function | Description |
|---------|-------------|
| `sum()` | Computes the sum of the column's values. |
| `mean()` | Computes the mean of the column's values. |
| `min()` | Finds the minimum value in the column. |
| `max()` | Finds the maximum value in the column. |
| `count()` | Counts the number of elements in the column. |
| `median()` | Computes the median of the column's values. |
| `std()` | Computes the standard deviation of the column's values. |
| `var()` | Computes the variance of the column's values. |
| `quantile(q)` | Computes a quantile (for example, the median for `q=0.5`). |
| `first()` | Returns the first value of the column within a group. |
| `last()` | Returns the last value of the column within a group. |
| `unique()` | Returns the unique values of the column. |
| `list()` | Collects the column's values into a list (useful in aggregations); recent Polars versions call this `implode()`. |
| `sort()` | Sorts the column's values. |
| `apply(func)` | Applies a custom function to the column's values; recent Polars versions call this `map_elements(func)`. |
| `is_null()` | Returns a boolean mask marking the null values. |
| `is_not_null()` | Returns a boolean mask marking the non-null values. |
| `is_finite()` | Checks whether the values are finite. |
| `is_infinite()` | Checks whether the values are infinite. |
| `clip(min_val, max_val)` | Restricts the values to a given range. |
| `alias(new_name)` | Renames the column (useful in aggregations and selections). |


#### Multiple filters

In [None]:
datos_pagina.filter((pl.col('orden')=='Territorial') & (pl.col('ciudad')=='Cali')).head()

nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
str,f32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,datetime[ns],datetime[ns],datetime[ns],str,str,str,str,str,str,str,str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,f32,f32,str,str,str,f32,f32,str,f32,str,str,struct[1],str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,str,str,str
"""SECRETARIA DE …",8904000000.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-12 00:00:00,2021-08-17 00:00:00,2021-12-31 00:00:00,"""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",16500000.0,0.0,16500000.0,0.0,16500000.0,0.0,0.0,0.0,"""Válido""","""2020003760195""","""2023""",1536500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,16500000.0,0.0,0.0,"""709412027""","""710837337""","""Prestación de …"
"""SANTIAGO DE CA…",890399040.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""No aplica/No p…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21297…","""CO1.PCCNTR.271…","""4164.010.26.1.…","""Cerrado""","""V1.80111500""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-07-28 00:00:00,2021-07-29 00:00:00,2021-10-15 00:00:00,"""A convenir""","""Cédula de Ciud…","""1143847407""","""DANIELA JOANNA…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",9462000.0,0.0,0.0,9462000.0,0.0,0.0,0.0,9462000.0,"""Válido""","""2020760010096""","""2023""",9462000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2131537&isFromPublicArea=True&isModal=true&asPopupView=true""}","""DANIELA JOANNA…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,9462000.0,0.0,0.0,"""702364142""","""709444681""","""Prestación de …"
"""INSTITUTO DEL …",805012864.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""deportes""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21903…","""CO1.PCCNTR.278…","""IND-21-0898""","""Cerrado""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-25 00:00:00,2021-08-26 00:00:00,2021-12-31 00:00:00,"""Como acordado …","""Cédula de Ciud…","""1006011630""","""MARIA CAMILA P…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",10000000.0,0.0,0.0,10000000.0,0.0,0.0,0.0,10000000.0,"""Válido""","""2021003760171""","""2021""",10000000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2192943&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Maria Camila P…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,10000000.0,0.0,0.0,"""701715203""","""713663086""","""PRESTACIÓN DE …"
"""SANTIAGO DE CA…",890399040.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""No aplica/No p…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.25641…","""CO1.PCCNTR.324…","""4163.001.26.1.…","""Cerrado""","""V1.80111501""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2022-01-15 00:00:00,2022-01-18 00:00:00,2022-07-01 00:00:00,"""Como acordado …","""Cédula de Ciud…","""1130624927""","""gustavo adolfo…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",14580000.0,0.0,0.0,14580000.0,0.0,0.0,0.0,14580000.0,"""Válido""","""2020760010473""","""2022""",14580000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2564729&isFromPublicArea=True&isModal=true&asPopupView=true""}","""gustavo adolfo…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,14580000.0,0.0,0.0,"""702435306""","""705626869""","""Prestación de …"
"""SANTIAGO DE CA…",890399040.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""No aplica/No p…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16939…","""CO1.PCCNTR.217…","""4137.010.26.1.…","""Cerrado""","""V1.80111501""","""Prestar los se…","""Prestación de …","""Contratación d…","""Servicios prof…",2021-01-27 00:00:00,2021-01-28 00:00:00,2021-07-01 00:00:00,"""Como acordado …","""Cédula de Ciud…","""14465932""","""Jose Francisco…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",30324000.0,0.0,0.0,30324000.0,0.0,0.0,0.0,30324000.0,"""Válido""","""2020760010112""","""2021""",30324000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1697531&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Jose Francisco…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,30324000.0,0.0,0.0,"""704025808""","""705626778""","""Prestar los se…"


Filter conditions can be combined with the following logical operators:

| Operator | Description                                     |
|----------|-------------------------------------------------|
| `&`      | Logical `AND`. Combines two or more conditions; all of them must be true. |
| `\|`      | Logical `OR`. Combines two or more conditions; at least one must be true. |
| `~`      | Logical `NOT`. Negates a condition, inverting its result. |


#### Selecting columns

In [None]:
datos_pagina.select(['departamento','ciudad','valor_del_contrato']).head()

departamento,ciudad,valor_del_contrato
str,str,f32
"""Valle del Cauc…","""Cali""",16500000.0
"""Antioquia""","""Medellín""",10961550.0
"""Atlántico""","""Barranquilla""",20234250.0
"""Distrito Capit…","""Bogotá""",30500000.0
"""Distrito Capit…","""Bogotá""",3708000000.0


In [None]:
datos_pagina.select(pl.col("^n.*l$")).head()

nombre_representante_legal,nacionalidad_representante_legal
str,str
"""Victoria Eugen…","""CO"""
"""GUSTAVO ADOLFO…","""CO"""
"""ANGEL VICENTE …","""CO"""
"""Xiomara Meliss…","""CO"""
"""YOBANY ANDRES …","""CO"""


### What are regular expressions (regex)?

Regular expressions, or regex, are search patterns used to match and manipulate text strings. They are especially useful for searching, extracting, and validating data that follows a pattern.

### Basic regex syntax:

- `.`: Matches any character except a line break.
- `*`: Matches zero or more repetitions of the preceding element.
- `+`: Matches one or more repetitions of the preceding element.
- `?`: Matches zero or one repetition of the preceding element.
- `\`: Escapes a special character so that it is treated literally. For example, `\.` matches a literal dot.
- `[]`: Defines a set of characters. For example, `[aeiou]` matches any lowercase vowel.
- `[^]`: Defines a negated set of characters. For example, `[^0-9]` matches any character that is not a digit.
- `|`: Represents an alternative. For example, `a|b` matches "a" or "b".
- `()`: Groups elements together. For example, `(ab)+` matches "ab", "abab", and so on.
- `^`: Matches the start of a line or string.
- `$`: Matches the end of a line or string.
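
The symbols above can be tried out with Python's standard `re` module. This is a small illustrative sketch; the sample strings are made up, except for the two column names, which come from the dataset used earlier.

```python
import re

# re.findall returns every non-overlapping match of the pattern in the text.
cualquiera = re.findall(r"c.t", "cat cot cut")        # "." stands for any character
cero_o_mas = re.findall(r"ab*", "a ab abbb")          # "*": zero or more "b"
uno_o_mas = re.findall(r"ab+", "a ab abbb")           # "+": one or more "b"
opcional = re.findall(r"colou?r", "color colour")     # "?": the "u" is optional
vocales = re.findall(r"[aeiou]", "polars")            # character set
no_digitos = re.findall(r"[^0-9]", "a1b2")            # negated character set

# "^" and "$" anchor the pattern to the start and end of the string, which is
# how the column selector "^n.*l$" picks names that start with n and end with l.
coincide = re.search(r"^n.*l$", "nombre_representante_legal") is not None
no_coincide = re.search(r"^n.*l$", "nit_entidad") is not None

print(cualquiera, cero_o_mas, uno_o_mas, opcional, vocales, no_digitos)
print(coincide, no_coincide)
```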

### Usage examples:

1. **Finding email addresses**:
   - Pattern: `[\w\.-]+@[\w\.-]+`
   - Meaning: Matches strings shaped like an email address (a loose pattern that does not fully validate the address).

2. **Finding phone numbers**:
   - Pattern: `\d{3}-\d{2}-\d{4}`
   - Meaning: Matches phone numbers written as XXX-XX-XXXX.

3. **Extracting dates**:
   - Pattern: `\d{2}/\d{2}/\d{4}`
   - Meaning: Matches dates written as DD/MM/YYYY.

4. **Validating strong passwords**:
   - Pattern: `^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$`
   - Meaning: Accepts passwords that contain at least one lowercase letter, one uppercase letter, and one digit, and are at least 8 characters long.
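
The four patterns can be checked with Python's `re` module. The sketch below uses a made-up sentence and made-up passwords purely for illustration. Note that look-ahead assertions such as `(?=...)` work in `re`, but the Rust regex engine behind Polars' `str.contains` does not support them, so the password pattern is not usable there as written.

```python
import re

# Hypothetical sample text containing one email, one phone number and one date.
texto = "Contact: ana.perez@example.com, tel 555-12-3456, signed on 12/08/2023."

correos = re.findall(r"[\w\.-]+@[\w\.-]+", texto)
telefonos = re.findall(r"\d{3}-\d{2}-\d{4}", texto)
fechas = re.findall(r"\d{2}/\d{2}/\d{4}", texto)

# The look-aheads require a lowercase letter, an uppercase letter and a digit;
# ".{8,}" then requires at least 8 characters in total.
patron_clave = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")
clave_valida = patron_clave.match("Segura123") is not None
clave_invalida = patron_clave.match("insegura") is not None

print(correos, telefonos, fechas, clave_valida, clave_invalida)
```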


### Exercise

1. Download the SECOP Integrado dataset from the open data portal, using the id `rpmr-utcd`.
2. Generate the Parquet file and report its size.
3. Load the file with Polars, selecting only the variables Municipio Entidad, Estado proceso, and Valor Contrato.
4. Filter the municipalities whose name starts with 'Cal', using this form: `filter(pl.col("ciudad").str.contains("regex pattern"))`
5. Group by process status and compute the mean contract value.


#### Handling nulls

In [None]:
datos_pagina.drop_nulls().head()

nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
str,f32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,datetime[ns],datetime[ns],datetime[ns],str,str,str,str,str,str,str,str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,f32,f32,str,str,str,f32,f32,str,f32,str,str,struct[1],str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,str,str,str
"""SECRETARIA DE …",8904000000.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-12 00:00:00,2021-08-17 00:00:00,2021-12-31 00:00:00,"""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",16500000.0,0.0,16500000.0,0.0,16500000.0,0.0,0.0,0.0,"""Válido""","""2020003760195""","""2023""",1536500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,16500000.0,0.0,0.0,"""709412027""","""710837337""","""Prestación de …"
"""INSTITUCIÓN UN…",890980160.0,"""Antioquia""","""Medellín""","""Colombia, Ant…","""Territorial""","""Educación Naci…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.30981…","""CO1.PCCNTR.387…","""CMA-CD-9797-94…","""En ejecución""","""V1.80111600""","""El Contratista…","""Prestación de …","""Contratación d…","""Servicios prof…",2022-08-03 00:00:00,2022-08-04 00:00:00,2023-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""8356403""","""GUSTAVO ADOLFO…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",10961550.0,0.0,10961550.0,0.0,10961550.0,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",4262200000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3100647&isFromPublicArea=True&isModal=true&asPopupView=true""}","""GUSTAVO ADOLFO…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,10961550.0,0.0,0.0,"""704629146""","""714239985""","""El Contratista…"
"""DISTRITO ESPEC…",890102016.0,"""Atlántico""","""Barranquilla""","""Colombia, Atl…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16947…","""CO1.PCCNTR.216…","""CD-57-2021-045…","""En ejecución""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-01-28 00:00:00,2021-02-02 00:00:00,2022-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""72142696""","""ANGEL VICENTE …","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",20234250.0,0.0,20234250.0,0.0,20234250.0,0.0,0.0,0.0,"""Válido""","""2021080010005""","""2021""",2002300000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1690930&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANGEL VICENTE …","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,20234250.0,0.0,0.0,"""702442096""","""709472161""","""PRESTACIÓN DE …"
"""OPERADORA DIST…",901526656.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Territorial""","""Transporte""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.47437…","""CO1.PCCNTR.533…","""CO1.PCCNTR.533…","""En ejecución""","""V1.82151500""","""PRESTACIÓN DE …","""Prestación de …","""Contratación r…","""Regla aplicabl…",2023-10-04 00:00:00,2023-10-26 00:00:00,2024-01-12 00:00:00,"""A convenir""","""Cédula de Ciud…","""53065957""","""Xiomara Meliss…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",30500000.0,0.0,0.0,30500000.0,0.0,0.0,0.0,30500000.0,"""No Válido""","""No Definido""","""No D""",30500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4829726&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Xiomara Meliss…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,30500000.0,0.0,0.0,"""715753901""","""708941216""","""PRESTACIÓN DE …"
"""COMANDO FAC""",899999104.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Nacional""","""defensa""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.34296…","""CO1.PCCNTR.433…","""157-00-A-COFAC…","""Modificado""","""V1.78181800""","""INSPECCIONES M…","""Prestación de …","""Selección Abre…","""Defensa y segu…",2022-12-23 00:00:00,2022-12-27 00:00:00,2023-12-21 00:00:00,"""DAP - Entregad…","""No Definido""","""900346811""","""AGG MRO""","""No""","""Si""","""No""","""Si""","""No""","""No""","""No""","""Distribuido""","""Inversión""",3708000000.0,0.0,1236000000.0,2472000000.0,1236000000.0,0.0,0.0,2472000000.0,"""No Válido""","""No Definido""","""No D""",3708000000.0,0.0,"""No""",20.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3524611&isFromPublicArea=True&isModal=true&asPopupView=true""}","""YOBANY ANDRES …","""CO""","""AVENIDA EL DOR…","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",3708000000.0,0.0,0.0,0.0,0.0,0.0,"""700409022""","""703192971""","""INSPECCIONES M…"


In [None]:
# Replace null values with 42
datos_pagina.fill_null(42).head()


nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
str,f32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,datetime[ns],datetime[ns],datetime[ns],str,str,str,str,str,str,str,str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,f32,f32,str,str,str,f32,f32,str,f32,str,str,struct[1],str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,str,str,str
"""SECRETARIA DE …",8904000000.0,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-12 00:00:00,2021-08-17 00:00:00,2021-12-31 00:00:00,"""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",16500000.0,0.0,16500000.0,0.0,16500000.0,0.0,0.0,0.0,"""Válido""","""2020003760195""","""2023""",1536500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,16500000.0,0.0,0.0,"""709412027""","""710837337""","""Prestación de …"
"""INSTITUCIÓN UN…",890980160.0,"""Antioquia""","""Medellín""","""Colombia, Ant…","""Territorial""","""Educación Naci…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.30981…","""CO1.PCCNTR.387…","""CMA-CD-9797-94…","""En ejecución""","""V1.80111600""","""El Contratista…","""Prestación de …","""Contratación d…","""Servicios prof…",2022-08-03 00:00:00,2022-08-04 00:00:00,2023-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""8356403""","""GUSTAVO ADOLFO…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",10961550.0,0.0,10961550.0,0.0,10961550.0,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",4262200000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3100647&isFromPublicArea=True&isModal=true&asPopupView=true""}","""GUSTAVO ADOLFO…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,10961550.0,0.0,0.0,"""704629146""","""714239985""","""El Contratista…"
"""DISTRITO ESPEC…",890102016.0,"""Atlántico""","""Barranquilla""","""Colombia, Atl…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16947…","""CO1.PCCNTR.216…","""CD-57-2021-045…","""En ejecución""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-01-28 00:00:00,2021-02-02 00:00:00,2022-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""72142696""","""ANGEL VICENTE …","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",20234250.0,0.0,20234250.0,0.0,20234250.0,0.0,0.0,0.0,"""Válido""","""2021080010005""","""2021""",2002300000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1690930&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANGEL VICENTE …","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,20234250.0,0.0,0.0,"""702442096""","""709472161""","""PRESTACIÓN DE …"
"""OPERADORA DIST…",901526656.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Territorial""","""Transporte""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.47437…","""CO1.PCCNTR.533…","""CO1.PCCNTR.533…","""En ejecución""","""V1.82151500""","""PRESTACIÓN DE …","""Prestación de …","""Contratación r…","""Regla aplicabl…",2023-10-04 00:00:00,2023-10-26 00:00:00,2024-01-12 00:00:00,"""A convenir""","""Cédula de Ciud…","""53065957""","""Xiomara Meliss…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",30500000.0,0.0,0.0,30500000.0,0.0,0.0,0.0,30500000.0,"""No Válido""","""No Definido""","""No D""",30500000.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4829726&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Xiomara Meliss…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,30500000.0,0.0,0.0,"""715753901""","""708941216""","""PRESTACIÓN DE …"
"""COMANDO FAC""",899999104.0,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Nacional""","""defensa""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.34296…","""CO1.PCCNTR.433…","""157-00-A-COFAC…","""Modificado""","""V1.78181800""","""INSPECCIONES M…","""Prestación de …","""Selección Abre…","""Defensa y segu…",2022-12-23 00:00:00,2022-12-27 00:00:00,2023-12-21 00:00:00,"""DAP - Entregad…","""No Definido""","""900346811""","""AGG MRO""","""No""","""Si""","""No""","""Si""","""No""","""No""","""No""","""Distribuido""","""Inversión""",3708000000.0,0.0,1236000000.0,2472000000.0,1236000000.0,0.0,0.0,2472000000.0,"""No Válido""","""No Definido""","""No D""",3708000000.0,0.0,"""No""",20.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3524611&isFromPublicArea=True&isModal=true&asPopupView=true""}","""YOBANY ANDRES …","""CO""","""AVENIDA EL DOR…","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",3708000000.0,0.0,0.0,0.0,0.0,0.0,"""700409022""","""703192971""","""INSPECCIONES M…"


#### Creating columns

In [None]:
datos_pagina.with_columns((pl.col("valor_del_contrato") /1000000).alias("Valor en millones")).select("Valor en millones").head()


Valor en millones
f32
16.5
10.96155
20.234249
30.5
3708.0


In [None]:

# Add several new columns to the DataFrame
datos_pagina.with_columns(
    [
        (pl.col("valor_del_contrato") /1000000).alias("Valor en millones"),
        # str.len_bytes() counts UTF-8 bytes, so accented letters count twice ("Medellín" gives 9)
        pl.col("ciudad").str.len_bytes().alias("longitudes_ciudad"),
    ]
).select(['Valor en millones','ciudad','longitudes_ciudad']).head()


Valor en millones,ciudad,longitudes_ciudad
f32,str,u32
16.5,"""Cali""",4
10.96155,"""Medellín""",9
20.234249,"""Barranquilla""",12
30.5,"""Bogotá""",7
3708.0,"""Bogotá""",7


In [None]:

# Add a row-counter column at index 0 (newer Polars releases name this method with_row_index)
datos_pagina.with_row_count()


row_nr,nombre_entidad,nit_entidad,departamento,ciudad,localizaci_n,orden,sector,rama,entidad_centralizada,proceso_de_compra,id_contrato,referencia_del_contrato,estado_contrato,codigo_de_categoria_principal,descripcion_del_proceso,tipo_de_contrato,modalidad_de_contratacion,justificacion_modalidad_de,fecha_de_firma,fecha_de_inicio_del_contrato,fecha_de_fin_del_contrato,condiciones_de_entrega,tipodocproveedor,documento_proveedor,proveedor_adjudicado,es_grupo,es_pyme,habilita_pago_adelantado,liquidaci_n,obligaci_n_ambiental,obligaciones_postconsumo,reversion,origen_de_los_recursos,destino_gasto,valor_del_contrato,valor_de_pago_adelantado,valor_facturado,valor_pendiente_de_pago,valor_pagado,valor_amortizado,valor_pendiente_de,valor_pendiente_de_ejecucion,estado_bpin,c_digo_bpin,anno_bpin,saldo_cdp,saldo_vigencia,espostconflicto,dias_adicionados,puntos_del_acuerdo,pilares_del_acuerdo,urlproceso,nombre_representante_legal,nacionalidad_representante_legal,domicilio_representante_legal,tipo_de_identificaci_n_representante_legal,identificaci_n_representante_legal,g_nero_representante_legal,presupuesto_general_de_la_nacion_pgn,sistema_general_de_participaciones,sistema_general_de_regal_as,recursos_propios_alcald_as_gobernaciones_y_resguardos_ind_genas_,recursos_de_credito,recursos_propios,codigo_entidad,codigo_proveedor,objeto_del_contrato
u32,str,f32,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,str,datetime[ns],datetime[ns],datetime[ns],str,str,str,str,str,str,str,str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,f32,f32,str,str,str,f32,f32,str,f32,str,str,struct[1],str,str,str,str,str,str,f32,f32,f32,f32,f32,f32,str,str,str
0,"""SECRETARIA DE …",8.9040e9,"""Valle del Cauc…","""Cali""","""Colombia, Val…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.21648…","""CO1.PCCNTR.275…","""1.310.02-59.2-…","""En ejecución""","""V1.72141003""","""Prestación de …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-08-12 00:00:00,2021-08-17 00:00:00,2021-12-31 00:00:00,"""A convenir""","""Cédula de Ciud…","""34317568""","""Victoria Eugen…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",1.65e7,0.0,1.65e7,0.0,1.65e7,0.0,0.0,0.0,"""Válido""","""2020003760195""","""2023""",1.5365e9,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.2168642&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Victoria Eugen…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,1.65e7,0.0,0.0,"""709412027""","""710837337""","""Prestación de …"
1,"""INSTITUCIÓN UN…",8.9098016e8,"""Antioquia""","""Medellín""","""Colombia, Ant…","""Territorial""","""Educación Naci…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.30981…","""CO1.PCCNTR.387…","""CMA-CD-9797-94…","""En ejecución""","""V1.80111600""","""El Contratista…","""Prestación de …","""Contratación d…","""Servicios prof…",2022-08-03 00:00:00,2022-08-04 00:00:00,2023-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""8356403""","""GUSTAVO ADOLFO…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",1.096155e7,0.0,1.096155e7,0.0,1.096155e7,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",4.2622e9,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3100647&isFromPublicArea=True&isModal=true&asPopupView=true""}","""GUSTAVO ADOLFO…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,1.096155e7,0.0,0.0,"""704629146""","""714239985""","""El Contratista…"
2,"""DISTRITO ESPEC…",8.90102016e8,"""Atlántico""","""Barranquilla""","""Colombia, Atl…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.16947…","""CO1.PCCNTR.216…","""CD-57-2021-045…","""En ejecución""","""V1.80111600""","""PRESTACIÓN DE …","""Prestación de …","""Contratación d…","""Servicios prof…",2021-01-28 00:00:00,2021-02-02 00:00:00,2022-01-01 00:00:00,"""No Definido""","""Cédula de Ciud…","""72142696""","""ANGEL VICENTE …","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",2.023425e7,0.0,2.023425e7,0.0,2.023425e7,0.0,0.0,0.0,"""Válido""","""2021080010005""","""2021""",2.0023e9,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.1690930&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANGEL VICENTE …","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,2.023425e7,0.0,0.0,"""702442096""","""709472161""","""PRESTACIÓN DE …"
3,"""OPERADORA DIST…",9.01526656e8,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Territorial""","""Transporte""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.47437…","""CO1.PCCNTR.533…","""CO1.PCCNTR.533…","""En ejecución""","""V1.82151500""","""PRESTACIÓN DE …","""Prestación de …","""Contratación r…","""Regla aplicabl…",2023-10-04 00:00:00,2023-10-26 00:00:00,2024-01-12 00:00:00,"""A convenir""","""Cédula de Ciud…","""53065957""","""Xiomara Meliss…","""No""","""No""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",3.05e7,0.0,0.0,3.05e7,0.0,0.0,0.0,3.05e7,"""No Válido""","""No Definido""","""No D""",3.05e7,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4829726&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Xiomara Meliss…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,3.05e7,0.0,0.0,"""715753901""","""708941216""","""PRESTACIÓN DE …"
4,"""COMANDO FAC""",8.99999104e8,"""Distrito Capit…","""Bogotá""","""Colombia, Bogo…","""Nacional""","""defensa""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.34296…","""CO1.PCCNTR.433…","""157-00-A-COFAC…","""Modificado""","""V1.78181800""","""INSPECCIONES M…","""Prestación de …","""Selección Abre…","""Defensa y segu…",2022-12-23 00:00:00,2022-12-27 00:00:00,2023-12-21 00:00:00,"""DAP - Entregad…","""No Definido""","""900346811""","""AGG MRO""","""No""","""Si""","""No""","""Si""","""No""","""No""","""No""","""Distribuido""","""Inversión""",3.7080e9,0.0,1.2360e9,2.4720e9,1.2360e9,0.0,0.0,2.4720e9,"""No Válido""","""No Definido""","""No D""",3.7080e9,0.0,"""No""",20.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.3524611&isFromPublicArea=True&isModal=true&asPopupView=true""}","""YOBANY ANDRES …","""CO""","""AVENIDA EL DOR…","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",3.7080e9,0.0,0.0,0.0,0.0,0.0,"""700409022""","""703192971""","""INSPECCIONES M…"
5,"""MUNICIPIO DE N…",8.91180032e8,"""Huila""","""Neiva""","""Colombia, Hui…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.43585…","""CO1.PCCNTR.447…","""CO1.PCCNTR.447…","""Activo""","""V1.93141701""","""REALIZAR,EL FE…","""Suministros""","""Mínima cuantía…","""Presupuesto in…",2018-09-18 00:00:00,2018-06-18 00:00:00,2018-06-28 00:00:00,"""Como acordado …","""No Definido""","""813011717""","""FUNDACION COLO…","""No""","""Si""","""No Definido""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",2.68e7,0.0,2.68e7,0.0,2.68e7,0.0,0.0,0.0,"""No Válido""","""No Definido""","""2018""",3e7,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.443273&isFromPublicArea=True&isModal=true&asPopupView=true""}","""ANDREA ESTEFAN…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,0.0,0.0,0.0,"""700642036""","""700866155""","""REALIZAR,EL FE…"
6,"""CENTRAL ADMINI…",9.00332544e8,"""Cundinamarca""","""Facatativá""","""Colombia, Cun…","""Nacional""","""defensa""","""Corporación Au…","""Centralizada""","""CO1.BDOS.62190…","""CO1.PCCNTR.686…","""ACEPTACIÓN DE …","""Activo""","""V1.81112300""","""MANTENIMIENTO …","""Prestación de …","""Mínima cuantía…","""Presupuesto in…",2018-12-13 00:00:00,2018-12-17 00:00:00,2019-11-03 00:00:00,"""Como acordado …","""No Definido""","""900034395""","""Black Hat Arch…","""No""","""Si""","""No Definido""","""No""","""No""","""No""","""No""","""Distribuido""","""Funcionamiento…",5.2276032e7,0.0,0.0,5.2276032e7,0.0,0.0,0.0,5.2276032e7,"""No Válido""","""No Definido""","""No D""",2.264046e6,1.5780e10,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.616883&isFromPublicArea=True&isModal=true&asPopupView=true""}","""RICARDO ANDRES…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,0.0,0.0,0.0,"""702917022""","""701895179""","""MANTENIMIENTO …"
7,"""DEPARTAMENTO D…",8.9200e9,"""Meta""","""Villavicencio""","""Colombia, Met…","""Territorial""","""Servicio Públi…","""Ejecutivo""","""Centralizada""","""CO1.BDOS.41511…","""CO1.PCCNTR.475…","""1413 DE 2023""","""terminado""","""V1.80111600""","""ASISTENCIA TÉC…","""Prestación de …","""Contratación d…","""Servicios prof…",2023-03-10 00:00:00,2023-03-13 00:00:00,2023-06-28 00:00:00,"""Como acordado …","""Cédula de Ciud…","""1121951978""","""JASSBLEYDI VAL…","""No""","""Si""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",1.1501e7,0.0,1.1501e7,0.0,1.1501e7,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",1.3144e7,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4152721&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Jassbleydi Val…","""CO""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",0.0,0.0,0.0,1.1501e7,0.0,0.0,"""700817075""","""714180031""","""ASISTENCIA TÉC…"
8,"""MUNICIPIO DE M…",8.90984896e8,"""Antioquia""","""Murindó""","""Colombia, Ant…","""Territorial""","""No aplica/No p…","""Corporación Au…","""Centralizada""","""CO1.BDOS.66372…","""CO1.PCCNTR.725…","""CO1.PCCNTR.725…","""Cancelado""","""V1.80111600""","""Sin Descripcio…","""Prestación de …","""Contratación d…","""Servicios prof…",,,,"""No Definido""","""Sin Descripcio…","""No Definido""","""Sin Descripcio…","""No""","""No""","""No Definido""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"""No Válido""","""No Definido""","""No D""",0.0,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.655156&isFromPublicArea=True&isModal=true&asPopupView=true""}","""Sin Descripcio…","""No definido""","""No Definido""","""Sin Descripcio…","""Sin Descripcio…","""No Definido""",0.0,0.0,0.0,0.0,0.0,0.0,"""702596578""","""0""","""No definido"""
9,"""SERVICIO NACIO…",9.0000e9,"""Magdalena""","""Santa Marta""","""Colombia, Mag…","""Nacional""","""Trabajo""","""Ejecutivo""","""Centralizada""","""CO1.BDOS.46079…","""CO1.PCCNTR.520…","""CO1.PCCNTR.520…","""En ejecución""","""V1.31162800""","""CONTRATAR EL S…","""Suministros""","""Mínima cuantía…","""Presupuesto in…",2023-07-17 00:00:00,2023-07-28 00:00:00,2024-01-01 00:00:00,"""No Definido""","""No Definido""","""901231329""","""GAFIMOPE SAS""","""No""","""Si""","""No""","""No""","""No""","""No""","""No""","""Distribuido""","""Inversión""",5.607e7,0.0,0.0,5.607e7,0.0,0.0,0.0,5.607e7,"""No Válido""","""No Definido""","""No D""",7e7,0.0,"""No""",0.0,"""No aplica""","""No aplica""","{""https://community.secop.gov.co/Public/Tendering/OpportunityDetail/Index?noticeUID=CO1.NTC.4622736&isFromPublicArea=True&isModal=true&asPopupView=true""}","""KELLY JOHANNA …","""CO""","""mz f casa 14 v…","""Sin Descripcio…","""Sin Descripcio…","""Femenino""",5.607e7,0.0,0.0,0.0,0.0,0.0,"""702988379""","""710108200""","""CONTRATAR EL S…"


### Reading and writing data with BigQuery

In [None]:
%%capture
pip install --upgrade google-cloud-bigquery

In [None]:
from google.cloud import bigquery
from google.oauth2 import service_account
import polars as pl

# Path to the service-account JSON key; replace it with the path to your own key.
key_path = r"C:\Users\nib1l\Downloads\motor-de-recomendaciones-1e3a4c8c8574.json"
credentials = service_account.Credentials.from_service_account_file(
    key_path, scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# The client runs queries and load jobs against the project the key belongs to.
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

# Identifiers of the table to read; replace them with your own.
project_id = "motor-de-recomendaciones"  # your project name
dataset_id = "datos_icfes"  # your dataset name
table_id = "icefes_2019"  # your table name

In [None]:


# Query the first 1,000 rows of the table identified by the parameters above.
QUERY = f"SELECT * FROM `{project_id}.{dataset_id}.{table_id}` LIMIT 1000"
query_job = client.query(QUERY)  # API request
rows = query_job.result()  # waits for the query to finish

# Convert the result to an Arrow table, then to a Polars DataFrame.
df = pl.from_arrow(rows.to_arrow())

In [None]:
import io
# Write DataFrame to stream as parquet file; does not hit disk
with io.BytesIO() as stream:
    df.write_parquet(stream)
    stream.seek(0)
    job = client.load_table_from_file(
        stream,
        destination='datos_icfes.icefes_20192',
        project='motor-de-recomendaciones',
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.PARQUET,
        ),
    )
job.result()  # Waits for the job to complete

LoadJob<project=motor-de-recomendaciones, location=northamerica-northeast1, id=97f476c6-a27a-4a07-915e-3925b8a95470>

### Comparison


#### Reading data
![](https://miro.medium.com/v2/resize:fit:828/format:webp/1*HWibbnVYohpKbpjMmL15rw.png)

#### Aggregation operations
![](https://miro.medium.com/v2/resize:fit:828/format:webp/1*7-xfg0arCNVTv4AG3yzTwg.png)

#### Filtering and selection

![](https://miro.medium.com/v2/resize:fit:828/format:webp/1*XR09526SmAUHrBwr0lfFzg.png)

#### Sorting operation
![](https://miro.medium.com/v2/resize:fit:828/format:webp/1*Blya6y4zfInlBPe-u2nOEA.png)

### Further reading

1. https://github.com/pola-rs/polars
2. https://medium.com/cuenex/pandas-2-0-vs-polars-the-ultimate-battle-a378eb75d6d1
3. https://docs.pola.rs/user-guide/