Fix empty TAG column result in to_dataframe when querying table model. by ColinLeeo · Pull Request #730 · apache/tsfile

ColinLeeo · 2026-02-25T06:09:02Z

No description provided.

codecov-commenter · 2026-02-25T06:30:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.02%. Comparing base (5feb69d) to head (777b1cb).

Additional details and impacted files

@@           Coverage Diff            @@
##           develop     #730   +/-   ##
========================================
  Coverage    62.02%   62.02%           
========================================
  Files          700      700           
  Lines        40142    40142           
  Branches      5650     5650           
========================================
  Hits         24897    24897           
  Misses       14551    14551           
  Partials       694      694


Copilot

Pull request overview

This pull request fixes an issue where querying a table model with only TAG or ATTRIBUTE columns (no FIELD columns) would return empty results. The fix introduces a no_data_query flag to detect when only non-FIELD columns are requested, and in such cases, queries all columns from the underlying data source and then filters the result to include only the requested columns plus the time column.

Changes:

- Added logic to detect queries with no FIELD columns (`no_data_query` flag)
- When no FIELD columns are requested, the query retrieves all columns and filters the result
- Added support for case-insensitive column and table name handling
- Added tests to verify correct behavior when querying single FIELD columns or time columns
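
The fallback described in the overview can be sketched with plain pandas. This is only an illustration under stated assumptions, not the code in python/tsfile/utils.py: `SOURCE`, `FIELD_COLUMNS`, `raw_query`, and `query` are hypothetical names, and the toy reader is assumed to return no rows unless at least one FIELD column is requested, which is the symptom this PR addresses.

```python
import pandas as pd

TIME_COLUMN = "time"  # assumed time column name for this sketch

# Toy table: "device" plays the TAG column, "temperature" the FIELD column.
SOURCE = pd.DataFrame(
    {
        "time": [1, 2, 3],
        "device": ["a", "a", "b"],
        "temperature": [20.5, 21.0, 19.8],
    }
)
FIELD_COLUMNS = {"temperature"}


def raw_query(columns):
    """Hypothetical reader: yields no rows unless a FIELD column is requested."""
    frame = SOURCE[[TIME_COLUMN] + columns]
    if not FIELD_COLUMNS.intersection(columns):
        return frame.iloc[0:0]  # same columns, zero rows
    return frame


def query(columns):
    """Work around the empty result by reading everything, then filtering."""
    columns = [name.lower() for name in columns]  # case-insensitive matching
    no_data_query = not FIELD_COLUMNS.intersection(columns)
    if no_data_query:
        all_columns = [name for name in SOURCE.columns if name != TIME_COLUMN]
        frame = raw_query(all_columns)
        # Keep only the time column plus what the caller asked for.
        return frame[[TIME_COLUMN] + columns]
    return raw_query(columns)


print(query(["Device"]))
```

For simplicity the sketch assumes the caller does not list the time column in `columns`.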

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/tsfile/utils.py | Added `no_data_query` flag and column filtering logic to handle queries with only TAG/ATTRIBUTE columns; added TIME_COLUMN import; converted table/column names to lowercase for case-insensitive matching |
| python/tests/test_load_tsfile_from_iotdb.py | Added test cases to verify correct behavior when querying single columns and time columns |


Copilot · 2026-02-25T08:01:13Z

python/tsfile/utils.py

+                        _column_names.insert(0, TIME_COLUMN)
+                        dataframe = dataframe[_column_names]


This code modifies the caller's original list by calling insert on _column_names. Since _column_names is a reference to the column_names parameter (after lowercasing at line 128), this mutates the caller's list. This mutation will cause issues if the same list is reused across multiple calls or if the caller expects their list to remain unchanged. Additionally, this mutation happens inside the iterator loop, so it will occur for every batch of data, repeatedly inserting TIME_COLUMN at position 0, which will cause incorrect behavior. The insertion should be done on a copy of the list, not on the original reference.

Suggested change

-                        _column_names.insert(0, TIME_COLUMN)
-                        dataframe = dataframe[_column_names]
+                        columns_with_time = [TIME_COLUMN] + list(_column_names)
+                        dataframe = dataframe[columns_with_time]
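
The effect of calling `insert` inside a per-batch loop can be reproduced without any tsfile code. The sketch below is a standalone illustration: `select_mutating` and `select_copying` are hypothetical helpers, and `batches` is any iterable standing in for the result-set iterator.

```python
TIME_COLUMN = "time"


def select_mutating(batches, column_names):
    """Mirrors the flagged pattern: insert() runs once per batch on the caller's list."""
    selected = []
    for _batch in batches:
        column_names.insert(0, TIME_COLUMN)
        selected.append(list(column_names))  # snapshot of the selection for this batch
    return selected


def select_copying(batches, column_names):
    """Builds a new list once, leaving the caller's list untouched."""
    columns_with_time = [TIME_COLUMN] + list(column_names)
    return [list(columns_with_time) for _batch in batches]


requested = ["device"]
print(select_mutating(range(2), requested))  # the second batch sees "time" twice
print(requested)                             # the caller's list has grown

requested = ["device"]
print(select_copying(range(2), requested))   # every batch sees the same selection
print(requested)                             # the caller's list is unchanged
```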

Copilot · 2026-02-25T08:01:13Z

python/tsfile/utils.py

+                    if no_data_query and _column_names is not None:
+                        _column_names.insert(0, TIME_COLUMN)


This line inserts TIME_COLUMN (the constant "time") instead of the actual time_column variable. If the table schema defines a time column with a different name (e.g., "id"), and that column name is not explicitly requested by the user, the dataframe will have a column with the schema's time column name, not "time". When trying to filter the dataframe by _column_names at line 186, pandas will raise a KeyError because "time" doesn't exist in the dataframe. This should use time_column instead of TIME_COLUMN. Additionally, this insertion should only happen if time_column is not already in _column_names.

Suggested change

-                    if no_data_query and _column_names is not None:
-                        _column_names.insert(0, TIME_COLUMN)
+                    if no_data_query and _column_names is not None and time_column is not None:
+                        if time_column not in _column_names:
+                            _column_names.insert(0, time_column)
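
One way to combine the guard above with the earlier advice about not mutating the caller's list is a small pure function. This is a sketch only: `columns_to_select` is a hypothetical helper that does not appear in the PR, and it assumes `time_column` holds the schema's actual time column name (or `None` when there is none).

```python
def columns_to_select(column_names, time_column):
    """Return a new list that includes time_column, prepending it if it was not requested.

    Returns column_names unchanged when either argument is None.
    """
    if column_names is None or time_column is None:
        return column_names
    if time_column in column_names:
        return list(column_names)  # already requested; keep the caller's order
    return [time_column] + list(column_names)


print(columns_to_select(["device"], "id"))
print(columns_to_select(["id", "device"], "id"))
```

Because the function returns a copy, it can be called once per batch without the selection growing.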

python/tsfile/utils.py

Fix empty TAG column result in to_dataframe when querying table model.

777b1cb

ColinLeeo requested a review from Copilot February 25, 2026 07:55

Copilot started reviewing on behalf of ColinLeeo February 25, 2026 07:55

Copilot AI reviewed Feb 25, 2026


jt2594838 approved these changes Feb 25, 2026


rename.

8f0cc31

ColinLeeo merged commit ebb4d97 into develop Feb 25, 2026
16 checks passed

ColinLeeo deleted the fix_tag_in_to_dataframe branch February 25, 2026 10:00

ColinLeeo merged 2 commits into develop from fix_tag_in_to_dataframe


4 participants
