Fix empty TAG column result in to_dataframe when querying table model. #730

Merged
ColinLeeo merged 2 commits into develop from fix_tag_in_to_dataframe
Feb 25, 2026

Conversation

@ColinLeeo
Contributor

No description provided.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.02%. Comparing base (5feb69d) to head (777b1cb).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #730   +/-   ##
========================================
  Coverage    62.02%   62.02%           
========================================
  Files          700      700           
  Lines        40142    40142           
  Branches      5650     5650           
========================================
  Hits         24897    24897           
  Misses       14551    14551           
  Partials       694      694           

Contributor

Copilot AI left a comment

Pull request overview

This pull request fixes an issue where querying a table model with only TAG or ATTRIBUTE columns (no FIELD columns) would return empty results. The fix introduces a no_data_query flag to detect when only non-FIELD columns are requested, and in such cases, queries all columns from the underlying data source and then filters the result to include only the requested columns plus the time column.

Changes:

  • Added logic to detect queries with no FIELD columns (no_data_query flag)
  • When no FIELD columns are requested, the query retrieves all columns and filters the result
  • Added support for case-insensitive column and table name handling
  • Added tests to verify correct behavior when querying single FIELD columns or time columns
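
The fallback described above can be sketched roughly as follows. This is an illustration only, not the code from `python/tsfile/utils.py`: the helpers `plan_query` and `trim_result`, the `schema` mapping, and the value `"time"` for `TIME_COLUMN` are all assumptions made for the example.

```python
import pandas as pd

TIME_COLUMN = "time"  # assumed value of the constant named in this review


def plan_query(requested, schema):
    """Hypothetical helper: decide which columns to read.

    `schema` maps lowercase column names to "TAG", "ATTRIBUTE" or "FIELD".
    """
    requested = [name.lower() for name in requested]
    # True when the request contains no FIELD column at all.
    no_data_query = not any(schema.get(name) == "FIELD" for name in requested)
    # Fall back to reading every column so that rows are actually produced.
    columns_to_read = list(schema) if no_data_query else requested
    return columns_to_read, no_data_query


def trim_result(frame, requested, no_data_query):
    """Hypothetical helper: keep only the time column plus the requested columns."""
    if not no_data_query:
        return frame
    wanted = [TIME_COLUMN] + [
        name.lower() for name in requested if name.lower() != TIME_COLUMN
    ]
    return frame[wanted]


schema = {"device": "TAG", "region": "TAG", "temperature": "FIELD"}
full = pd.DataFrame(
    {
        "time": [1, 2],
        "device": ["d1", "d2"],
        "region": ["r1", "r2"],
        "temperature": [20.5, 21.0],
    }
)

columns_to_read, no_data_query = plan_query(["Device"], schema)
result = trim_result(full, ["Device"], no_data_query)
print(list(result.columns))  # ['time', 'device']
```

Requesting only the TAG column `Device` sets the flag, widens the read to all three columns, and then narrows the result back to `time` and `device`.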

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/tsfile/utils.py | Added no_data_query flag and column filtering logic to handle queries with only TAG/ATTRIBUTE columns; added TIME_COLUMN import; converted table/column names to lowercase for case-insensitive matching |
| python/tests/test_load_tsfile_from_iotdb.py | Added test cases to verify correct behavior when querying single columns and time columns |

Comment on lines +185 to +186
_column_names.insert(0, TIME_COLUMN)
dataframe = dataframe[_column_names]

Copilot AI Feb 25, 2026

This code modifies the caller's original list by calling insert on _column_names. Since _column_names is a reference to the column_names parameter (after lowercasing at line 128), this mutates the caller's list. This mutation will cause issues if the same list is reused across multiple calls or if the caller expects their list to remain unchanged. Additionally, this mutation happens inside the iterator loop, so it will occur for every batch of data, repeatedly inserting TIME_COLUMN at position 0, which will cause incorrect behavior. The insertion should be done on a copy of the list, not on the original reference.

Suggested change
- _column_names.insert(0, TIME_COLUMN)
- dataframe = dataframe[_column_names]
+ columns_with_time = [TIME_COLUMN] + list(_column_names)
+ dataframe = dataframe[columns_with_time]
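
To make the difference concrete, here is a small standalone sketch of the two patterns run over two batches. The functions `select_mutating` and `select_copying` are hypothetical stand-ins for the batch loop, and `"time"` is an assumed value for `TIME_COLUMN`; no tsfile code is involved.

```python
TIME_COLUMN = "time"  # assumed value of the constant named in this review


def select_mutating(batches, column_names):
    """Mirrors the flagged pattern: insert() on the caller's list for every batch."""
    seen = []
    for _ in batches:
        column_names.insert(0, TIME_COLUMN)  # grows the caller's list each time
        seen.append(list(column_names))
    return seen


def select_copying(batches, column_names):
    """Mirrors the suggested pattern: build a fresh list for every batch."""
    seen = []
    for _ in batches:
        columns_with_time = [TIME_COLUMN] + list(column_names)
        seen.append(columns_with_time)
    return seen


caller_list = ["device"]
mutated = select_mutating(range(2), caller_list)
print(mutated)      # [['time', 'device'], ['time', 'time', 'device']]
print(caller_list)  # ['time', 'time', 'device']

safe_list = ["device"]
copied = select_copying(range(2), safe_list)
print(copied)     # [['time', 'device'], ['time', 'device']]
print(safe_list)  # ['device']
```

With the mutating version, the second batch already selects `time` twice and the caller's list is left changed after the call; the copying version keeps both stable.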

Comment on lines +184 to +185
if no_data_query and _column_names is not None:
    _column_names.insert(0, TIME_COLUMN)

Copilot AI Feb 25, 2026

This line inserts TIME_COLUMN (the constant "time") instead of the actual time_column variable. If the table schema defines a time column with a different name (e.g., "id"), and that column name is not explicitly requested by the user, the dataframe will have a column with the schema's time column name, not "time". When trying to filter the dataframe by _column_names at line 186, pandas will raise a KeyError because "time" doesn't exist in the dataframe. This should use time_column instead of TIME_COLUMN. Additionally, this insertion should only happen if time_column is not already in _column_names.

Suggested change
- if no_data_query and _column_names is not None:
-     _column_names.insert(0, TIME_COLUMN)
+ if no_data_query and _column_names is not None and time_column is not None:
+     if time_column not in _column_names:
+         _column_names.insert(0, time_column)
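
One possible way to apply this check without touching the caller's list is sketched below. The helper `columns_with_time` is hypothetical and not part of the PR; it only illustrates using the schema's actual time column name and skipping the insert when that name is already requested.

```python
def columns_with_time(column_names, time_column):
    """Hypothetical helper: return a new list that includes the time column.

    The time column is prepended only when it is known and not already
    requested; the input list is never modified.
    """
    if column_names is None or time_column is None:
        return column_names
    names = list(column_names)  # copy, so the caller's list stays unchanged
    if time_column not in names:
        names.insert(0, time_column)
    return names


requested = ["device"]
print(columns_with_time(requested, "ts"))         # ['ts', 'device']
print(columns_with_time(["ts", "device"], "ts"))  # ['ts', 'device']
print(requested)                                  # ['device']
```

The resulting list can then be used to index the dataframe. Because it carries the schema's own time column name rather than a fixed `"time"`, the `KeyError` described in the comment would not be triggered by a differently named time column.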

@ColinLeeo ColinLeeo merged commit ebb4d97 into develop Feb 25, 2026
16 checks passed
@ColinLeeo ColinLeeo deleted the fix_tag_in_to_dataframe branch February 25, 2026 10:00