
feat: Add CSV export capability to DataFrameClient #45

Merged: 20 commits, Apr 6, 2023 (showing changes from 6 commits)
1 change: 1 addition & 0 deletions docs/api_reference/dataframe.rst
@@ -20,6 +20,7 @@ nisystemlink.clients.dataframe
.. automethod:: append_table_data
.. automethod:: query_table_data
.. automethod:: query_decimated_data
.. automethod:: export_table_data

.. automodule:: nisystemlink.clients.dataframe.models
:members:
16 changes: 12 additions & 4 deletions docs/getting_started.rst
@@ -76,13 +76,13 @@ Subscribe to tag changes
:language: python
:linenos:

-Data Frame API
+DataFrame API
--------------

Overview
~~~~~~~~

-The :class:`.DataFrameClient` class is the primary entry point of the Data Frame API.
+The :class:`.DataFrameClient` class is the primary entry point of the DataFrame API.

When constructing a :class:`.DataFrameClient`, you can pass an
:class:`.HttpConfiguration` (like one retrieved from the
@@ -91,11 +91,14 @@ default connection. The default connection depends on your environment.

With a :class:`.DataFrameClient` object, you can:

-* Create and delete Data Frame Tables.
+* Create and delete data tables.

* Modify table metadata and query for tables by their metadata.

-* Append rows of data to a table, query for rows of data from a table, and decimate table data.
+* Append rows of data to a table, query for rows of data from a table, and
+  decimate table data.

+* Export table data in a comma-separated values (CSV) format.

Examples
~~~~~~~~
@@ -111,3 +111,8 @@ Query and read data from a table
.. literalinclude:: ../examples/dataframe/query_read_data.py
:language: python
:linenos:

Export data from a table
.. literalinclude:: ../examples/dataframe/export_data.py
:language: python
:linenos:
33 changes: 33 additions & 0 deletions examples/dataframe/export_data.py
@@ -0,0 +1,33 @@
from shutil import copyfileobj

import pandas as pd
from nisystemlink.clients.dataframe import DataFrameClient
from nisystemlink.clients.dataframe.models import (
    ColumnFilter,
    ColumnOrderBy,
    ExportFormat,
    ExportTableDataRequest,
    FilterOperation,
)

client = DataFrameClient()

# List a table
response = client.list_tables(take=1)
table = response.tables[0]

# Export table data with query options
request = ExportTableDataRequest(
    columns=['col1'],
    order_by=[ColumnOrderBy(column='col2', descending=True)],
    filters=[ColumnFilter(column='col1', operation=FilterOperation.NotEquals, value='0')],
    response_format=ExportFormat.CSV)

data = client.export_table_data(id=table.id, query=request)

# Write the exported data to a file
with open(f'{table.name}.csv', 'wb') as f:
    copyfileobj(data, f)

# Alternatively, load the exported data into a pandas dataframe. The
# file-like response can only be consumed once, so export again first.
data = client.export_table_data(id=table.id, query=request)
df = pd.read_csv(data)
28 changes: 28 additions & 0 deletions nisystemlink/clients/core/helpers/generator_file_like.py
Comment from the PR author:
A couple of comments on this:

  • It's returning the data as binary data. The binary can be written to a CSV file or to pandas directly, but if the user wants to get the data as a string in code, they will need to decode it themselves. We did this in anticipation of adding binary export formats in the future (TDMS).
  • I didn't implement readline or readlines, since they are not required for writing to a file or to pandas. I can add them if we want this to be a fully-functional file-like object, but they were getting a bit messy with iterating the binary and normalizing line endings to what python expects.

@@ -0,0 +1,28 @@
"""A file-like object adapter that wraps a Python generator, providing a way to
iterate over the generator as if it were a file.
"""

from typing import Iterator


class GeneratorFileLike:
    def __init__(self, generator: Iterator[bytes]):
        self._generator = generator
        self._buffer = b''

    def read(self, size: int = -1) -> bytes:
        """Read at most `size` bytes from the file-like object. If `size` is
        not specified or is negative, read until the generator is exhausted
        and return all bytes read.
        """
        while size < 0 or len(self._buffer) < size:
            try:
                chunk = next(self._generator)
                self._buffer += chunk
            except StopIteration:
                break
        if size < 0:
            data = self._buffer
            self._buffer = b''
        else:
            data = self._buffer[:size]
            self._buffer = self._buffer[size:]
        return data
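As a review aid, the adapter's behavior can be checked with a plain in-memory generator. This is a self-contained sketch (the class body is repeated here so the snippet runs on its own, outside the package):

```python
from typing import Iterator


class GeneratorFileLike:
    """Minimal copy of the adapter above, repeated so this sketch runs standalone."""

    def __init__(self, generator: Iterator[bytes]):
        self._generator = generator
        self._buffer = b''

    def read(self, size: int = -1) -> bytes:
        # Pull chunks from the generator until we have `size` bytes buffered,
        # or until the generator is exhausted.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer += next(self._generator)
            except StopIteration:
                break
        if size < 0:
            data, self._buffer = self._buffer, b''
        else:
            data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data


# Chunks as an HTTP response might stream them
f = GeneratorFileLike(iter([b'a,b\n', b'1,2\n', b'3,4\n']))
first = f.read(4)  # exactly one line's worth of bytes
rest = f.read()    # everything remaining
```

Note how a short `read(n)` leaves the remainder buffered for the next call, which is the behavior `shutil.copyfileobj` and `pandas.read_csv` rely on.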
55 changes: 40 additions & 15 deletions nisystemlink/clients/dataframe/_data_frame_client.py
Original file line number Diff line number Diff line change
@@ -5,7 +5,8 @@
from nisystemlink.clients import core
from nisystemlink.clients.core._uplink._base_client import BaseClient
from nisystemlink.clients.core._uplink._methods import delete, get, patch, post
+from nisystemlink.clients.core.helpers.generator_file_like import GeneratorFileLike
-from uplink import Body, Field, Path, Query
+from uplink import Body, Field, Path, Query, response_handler

from . import models

@@ -21,7 +22,7 @@ def __init__(self, configuration: Optional[core.HttpConfiguration] = None):
is used.

Raises:
-ApiException: if unable to communicate with the Data Frame service.
+ApiException: if unable to communicate with the DataFrame Service.
"""
if configuration is None:
configuration = core.JupyterHttpConfiguration()
@@ -36,7 +37,7 @@ def api_info(self) -> models.ApiInfo:
Information about available API operations.

Raises:
-ApiException: if unable to communicate with the Data Frame service.
+ApiException: if unable to communicate with the DataFrame Service.
"""
...

@@ -74,7 +75,7 @@ def list_tables(
The list of tables with a continuation token.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -90,7 +91,7 @@ def create_table(self, table: models.CreateTableRequest) -> str:
The ID of the newly created table.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -106,7 +107,7 @@ def query_tables(self, query: models.QueryTablesRequest) -> models.PagedTables:
The list of tables with a continuation token.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -122,7 +123,7 @@ def get_table_metadata(self, id: str) -> models.TableMetadata:
The metadata for the table.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -136,7 +137,7 @@ def modify_table(self, id: str, update: models.ModifyTableRequest) -> None:
update: The metadata to update.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -149,7 +150,7 @@ def delete_table(self, id: str) -> None:
id (str): Unique ID of a DataFrame table.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -168,7 +169,7 @@ def delete_tables(
tables were deleted successfully.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -187,7 +188,7 @@ def modify_tables(
tables were modified successfully.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -230,7 +231,7 @@ def get_table_data(
The table data and total number of rows with a continuation token.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -244,7 +245,7 @@ def append_table_data(self, id: str, data: models.AppendTableDataRequest) -> None:
data: The rows of data to append and any additional options.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -263,7 +264,7 @@ def query_table_data(
The table data and total number of rows with a continuation token.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
@@ -282,7 +283,31 @@ def query_decimated_data(
The decimated table data.

Raises:
-ApiException: if unable to communicate with the Data Frame service
+ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...

@response_handler()
def iter_content_filelike_wrapper(response):
return GeneratorFileLike(response.iter_content(chunk_size=4096))

@iter_content_filelike_wrapper
@post("tables/{id}/export-data", args=[Path, Body])
def export_table_data(
self, id: str, query: models.ExportTableDataRequest
) -> GeneratorFileLike:
"""Exports rows of data that match a filter from the table identified by its ID.

Args:
id: Unique ID of a DataFrame table.
query: The filtering, sorting, and export format to apply when exporting data.

Returns:
A file-like object for reading the exported data.

Raises:
ApiException: if unable to communicate with the DataFrame Service
or provided an invalid argument.
"""
...
6 changes: 4 additions & 2 deletions nisystemlink/clients/dataframe/models/__init__.py
@@ -2,10 +2,13 @@
from ._api_info import ApiInfo, Operation, OperationsV1
from ._create_table_request import CreateTableRequest
from ._column import Column
+from ._column_filter import FilterOperation, ColumnFilter
+from ._column_order_by import ColumnOrderBy
from ._column_type import ColumnType
from ._data_frame import DataFrame
from ._data_type import DataType
from ._delete_tables_partial_success import DeleteTablesPartialSuccess
+from ._export_table_data_request import ExportTableDataRequest, ExportFormat
from ._modify_tables_partial_success import ModifyTablesPartialSuccess
from ._modify_table_request import ColumnMetadataPatch, ModifyTableRequest
from ._modify_tables_request import ModifyTablesRequest, TableMetdataModification
@@ -17,8 +20,7 @@
DecimationOptions,
QueryDecimatedDataRequest,
)
-from ._query_table_data_base import ColumnFilter, FilterOperation
-from ._query_table_data_request import ColumnOrderBy, QueryTableDataRequest
+from ._query_table_data_request import QueryTableDataRequest
from ._query_tables_request import QueryTablesRequest
from ._table_metadata import TableMetadata
from ._table_rows import TableRows
38 changes: 38 additions & 0 deletions nisystemlink/clients/dataframe/models/_column_filter.py
@@ -0,0 +1,38 @@
from enum import Enum
from typing import Optional

from nisystemlink.clients.core._uplink._json_model import JsonModel


class FilterOperation(str, Enum):
    """Represents the different operations that can be used in a filter."""

    Equals = "EQUALS"
    NotEquals = "NOT_EQUALS"
    LessThan = "LESS_THAN"
    LessThanEquals = "LESS_THAN_EQUALS"
    GreaterThan = "GREATER_THAN"
    GreaterThanEquals = "GREATER_THAN_EQUALS"
    Contains = "CONTAINS"
    NotContains = "NOT_CONTAINS"


class ColumnFilter(JsonModel):
    """A filter to apply to the table data."""

    column: str
    """The name of the column to use for filtering."""

    operation: FilterOperation
    """How to compare the column's value with the specified value.

    An error is returned if the column's data type does not support the specified operation:

    * String columns only support ``EQUALS``, ``NOT_EQUALS``, ``CONTAINS``, and ``NOT_CONTAINS``.
    * Non-string columns do not support ``CONTAINS`` or ``NOT_CONTAINS``.
    * When ``value`` is ``None``, the operation must be ``EQUALS`` or ``NOT_EQUALS``.
    * When ``value`` is ``NaN`` for a floating-point column, the operation must be ``NOT_EQUALS``.
    """

    value: Optional[str]
    """The comparison value to use for filtering. An error will be returned if
    the value cannot be converted to the column's data type."""
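The operation constraints documented above can be summarized as a small pure-Python check. This is an illustration only; the actual validation happens in the DataFrame Service, and this helper is not part of the client (the enum is copied so the sketch runs standalone):

```python
from enum import Enum


class FilterOperation(str, Enum):
    # Copied from the model above so this sketch runs standalone
    Equals = "EQUALS"
    NotEquals = "NOT_EQUALS"
    LessThan = "LESS_THAN"
    LessThanEquals = "LESS_THAN_EQUALS"
    GreaterThan = "GREATER_THAN"
    GreaterThanEquals = "GREATER_THAN_EQUALS"
    Contains = "CONTAINS"
    NotContains = "NOT_CONTAINS"


EQUALITY_OPS = {FilterOperation.Equals, FilterOperation.NotEquals}
CONTAINS_OPS = {FilterOperation.Contains, FilterOperation.NotContains}


def operation_is_valid(
    op: FilterOperation,
    column_is_string: bool,
    value_is_none: bool = False,
    value_is_nan: bool = False,
) -> bool:
    """Mirror the documented rules for which operations a column supports."""
    if value_is_none:
        # None can only be compared with EQUALS / NOT_EQUALS
        return op in EQUALITY_OPS
    if value_is_nan:
        # NaN never compares equal, so only NOT_EQUALS is allowed
        return op is FilterOperation.NotEquals
    if column_is_string:
        # Strings support equality and substring checks only
        return op in EQUALITY_OPS | CONTAINS_OPS
    # Non-string columns support everything except substring checks
    return op not in CONTAINS_OPS
```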
13 changes: 13 additions & 0 deletions nisystemlink/clients/dataframe/models/_column_order_by.py
@@ -0,0 +1,13 @@
from typing import Optional

from nisystemlink.clients.core._uplink._json_model import JsonModel


class ColumnOrderBy(JsonModel):
    """Specifies a column to order by and the ordering direction."""

    column: str
    """The name of the column to order by."""

    descending: Optional[bool] = None
    """Whether the ordering should be in descending order."""
nisystemlink/clients/dataframe/models/_export_table_data_request.py
@@ -0,0 +1,40 @@
from enum import Enum
from typing import List, Optional

from nisystemlink.clients.core._uplink._json_model import JsonModel

from ._column_filter import ColumnFilter
from ._column_order_by import ColumnOrderBy


class ExportFormat(str, Enum):
    """The format of the exported data."""

    CSV = 'CSV'
    """Comma-separated values."""


class ExportTableDataRequest(JsonModel):
    """Specifies the parameters for a data export with ordering and filtering."""

    columns: Optional[List[str]] = None
    """The names of columns to include in the export. The export will
    include the columns in the same order specified in this parameter. All
    columns are included in the order specified at table creation if this
    property is excluded."""

    order_by: Optional[List[ColumnOrderBy]] = None
    """A list of columns to order the results by. Multiple columns may be
    specified to order rows that have the same value for prior columns. The
    columns used for sorting do not need to be included in the columns list, in
    which case they are not included in the export."""

    filters: Optional[List[ColumnFilter]] = None
    """A list of columns to filter by. Only rows whose columns contain values
    matching all of the specified filters are returned. The columns used for
    filtering do not need to be included in the columns list, in which case
    they are not included in the export."""

    response_format: ExportFormat
    """The format of the exported data. The only response format
    currently supported is ``CSV``."""
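To make the `filters` and `order_by` semantics concrete, here is a pure-Python sketch of the documented behavior over in-memory rows. This is an illustration only; the real filtering and sorting happen server-side, and the lambdas and tuples below stand in for `ColumnFilter` and `ColumnOrderBy` objects:

```python
# In-memory rows standing in for table data
rows = [
    {'col1': 3, 'col2': 'b'},
    {'col1': 0, 'col2': 'c'},
    {'col1': 1, 'col2': 'a'},
    {'col1': 3, 'col2': 'a'},
]

# Like ColumnFilter(column='col1', operation=NOT_EQUALS, value='0'):
# a row must match ALL filters to be included
filters = [lambda r: r['col1'] != 0]

# Like a ColumnOrderBy list: later columns break ties among earlier ones
order_by = [('col1', False), ('col2', True)]  # (column, descending)

result = [r for r in rows if all(f(r) for f in filters)]
# Apply sort keys in reverse so the first column is primary (sort is stable)
for column, descending in reversed(order_by):
    result.sort(key=lambda r: r[column], reverse=descending)

# result: col1 ascending, with col1 ties ordered by col2 descending
```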