TableStatistics implementation for dolt side #3365

Merged
merged 49 commits into from Jun 27, 2022

@jycor jycor commented May 6, 2022

Implements the sql.StatisticsTable interface

timsehn commented Jun 14, 2022

Also, why does this have no release notes? @jcor11599, you are overusing this.

@jycor jycor changed the title from "[no-release-notes] column_statistics table in mysql db" to "column_statistics table in mysql db" on Jun 14, 2022
@jycor jycor changed the title from "column_statistics table in mysql db" to "column_statistics table in information_schema" on Jun 21, 2022
@jycor jycor changed the title from "column_statistics table in information_schema" to "TableStatistics implementation for dolt side" on Jun 23, 2022

@max-hoffman max-hoffman left a comment

Pretty good. Some nits about docstrings, and it's worth refactoring the histogram building so that it is written only once, on the GMS side.

// AnalyzeTable implements the sql.StatisticsTable interface.
func (t *DoltTable) AnalyzeTable(ctx *sql.Context) error {
table, err := t.DoltTable(ctx)

Now might be a good time to make either a HistogramBuilder or a TableStatisticsBuilder in GMS to avoid the duplication here. A couple of ways to do that: i) a builder for each table that consumes rows; ii) a constructor that is passed an sql.RowIter, exhausts it, and returns map[string]sql.TableStatistics.
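
A rough sketch of how the two options in this comment could fit together, with option (ii) layered on top of option (i). Everything here is hypothetical: `Row`, `RowIter`, `ColumnStats`, `StatsBuilder`, and `StatsFromIter` are simplified stand-ins written for illustration, not the real GMS `sql.Row`, `sql.RowIter`, or `sql.TableStatistics` types, and the sketch assumes every non-NULL value is a `float64`.

```go
package main

import (
	"fmt"
	"io"
)

// Row and RowIter are simplified, hypothetical stand-ins for the GMS row
// types; they are NOT the real sql.Row / sql.RowIter definitions.
type Row []interface{}

type RowIter interface {
	// Next returns io.EOF once the iterator is exhausted.
	Next() (Row, error)
}

// ColumnStats is a hypothetical per-column summary.
type ColumnStats struct {
	Count     uint64             // number of non-NULL values seen
	NullCount uint64             // number of NULL values seen
	Min, Max  float64            // only meaningful when Count > 0
	Freq      map[float64]uint64 // value -> number of occurrences
}

// Distinct returns the number of distinct non-NULL values seen.
func (s ColumnStats) Distinct() uint64 { return uint64(len(s.Freq)) }

// StatsBuilder is option (i): one builder per table that consumes rows.
type StatsBuilder struct {
	cols  []string
	stats map[string]*ColumnStats
}

func NewStatsBuilder(cols []string) *StatsBuilder {
	b := &StatsBuilder{cols: cols, stats: make(map[string]*ColumnStats, len(cols))}
	for _, c := range cols {
		b.stats[c] = &ColumnStats{Freq: make(map[float64]uint64)}
	}
	return b
}

// Add folds one row into the running statistics.
func (b *StatsBuilder) Add(r Row) error {
	if len(r) != len(b.cols) {
		return fmt.Errorf("row has %d values, want %d", len(r), len(b.cols))
	}
	for i, c := range b.cols {
		s := b.stats[c]
		switch v := r[i].(type) {
		case nil:
			s.NullCount++
		case float64:
			if s.Count == 0 || v < s.Min {
				s.Min = v
			}
			if s.Count == 0 || v > s.Max {
				s.Max = v
			}
			s.Count++
			s.Freq[v]++
		default:
			return fmt.Errorf("column %q: unsupported type %T", c, v)
		}
	}
	return nil
}

// Build returns a copy of the per-column summaries keyed by column name.
func (b *StatsBuilder) Build() map[string]ColumnStats {
	out := make(map[string]ColumnStats, len(b.cols))
	for c, s := range b.stats {
		out[c] = *s
	}
	return out
}

// StatsFromIter is option (ii): exhaust the iterator and return the stats.
func StatsFromIter(cols []string, iter RowIter) (map[string]ColumnStats, error) {
	b := NewStatsBuilder(cols)
	for {
		r, err := iter.Next()
		if err == io.EOF {
			return b.Build(), nil
		}
		if err != nil {
			return nil, err
		}
		if err := b.Add(r); err != nil {
			return nil, err
		}
	}
}

// sliceIter is a tiny in-memory RowIter for trying the sketch out.
type sliceIter struct {
	rows []Row
	pos  int
}

func (it *sliceIter) Next() (Row, error) {
	if it.pos >= len(it.rows) {
		return nil, io.EOF
	}
	r := it.rows[it.pos]
	it.pos++
	return r, nil
}
```

In this shape the row-consuming builder is the single place where the accumulation logic lives, and the iterator-based constructor is only a thin loop around it, which is one way to get the "write it once" property the review asks for.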

(Maybe fold this in with the info schema fix?)

@@ -98,10 +99,16 @@ func (ds *DoltTableStatistics) NullCount() uint64 {
}

func (ds *DoltTableStatistics) Histogram(colName string) (*sql.Histogram, error) {

Why do we need DoltTableStatistics? Is it just to wrap map[string]sql.TableStatistics?

It implements sql.TableStatistics.

}
}

// TODO: logic to determine ranges for buckets

Merged comments should either 1) be consolidated into a docstring that describes what the function does, or 2) explain tricky edge cases that future readers would stumble on.
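
To make the suggestion concrete, here is a minimal sketch of a docstring that both describes the function and records its edge cases, in place of a bare TODO. The function `bucketIndex` and its equal-width bucketing rule are hypothetical and chosen only for illustration; they are not the bucketing logic from this PR.

```go
package main

// bucketIndex returns which of n equal-width buckets spanning [min, max]
// the value v falls into, as a zero-based index.
//
// Edge cases:
//   - values at or below min map to the first bucket;
//   - values at or above max map to the last bucket;
//   - a degenerate range (max <= min) or n <= 0 always yields 0.
func bucketIndex(v, min, max float64, n int) int {
	if n <= 0 || max <= min {
		return 0
	}
	if v <= min {
		return 0
	}
	if v >= max {
		return n - 1
	}
	idx := int((v - min) / (max - min) * float64(n))
	if idx >= n {
		// Guard against floating-point rounding right below max.
		idx = n - 1
	}
	return idx
}
```

The one remaining inline comment explains a non-obvious guard, which matches the second kind of comment described above.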

rowCount: numRows,
}, nil

// Load stats from the session

Same as above about docstrings.

@jycor jycor merged commit 369435f into main Jun 27, 2022
@Hydrocharged Hydrocharged deleted the james/statistics branch October 13, 2022 12:43
4 participants