Support run extendedSQL for hive #319
4e115c0 to 3e9660e
3e9660e to b889cd1
Dockerfile
Outdated
@@ -4,7 +4,7 @@ RUN apt-get update
 RUN apt-get install -y python3-pip

 RUN pip3 install --upgrade pip
-RUN pip3 install tensorflow mysql-connector-python pyhive jupyter sqlflow
+RUN pip3 install tensorflow mysql-connector-python thrift pyhive jupyter sqlflow
 # Fix jupyter server "connecting to kernel" problem
 # https://github.com/jupyter/notebook/issues/2664#issuecomment-468954423
 RUN pip3 install tornado==4.5.3
Is it also possible to incorporate this line?
sql/codegen.go
Outdated
	return &columnType{n, t}
}

func translateColumnType(ct *columnType) columnType {
How about using a constant map to define the type transformation, like:
var column2feature = map[string]string{
	"FLOAT":      "numeric_column",
	"FLOAT_TYPE": "numeric_column", // for hive only
}
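A minimal sketch of how such a lookup table might be consumed. The helper name `featureColumnFor`, the case-insensitive matching, and the error message are assumptions for illustration and are not taken from the PR; only the two map entries come from the comment above.

```go
package main

import (
	"fmt"
	"strings"
)

// column2feature maps a database column type name to the name of a
// feature column constructor. Only the two entries from the review
// comment are listed here; a real table would need more.
var column2feature = map[string]string{
	"FLOAT":      "numeric_column",
	"FLOAT_TYPE": "numeric_column", // for hive only
}

// featureColumnFor is a hypothetical helper: it upper-cases the type
// name reported by the driver and looks it up in the table, returning
// an error for types that have no mapping.
func featureColumnFor(dbType string) (string, error) {
	fc, ok := column2feature[strings.ToUpper(dbType)]
	if !ok {
		return "", fmt.Errorf("unsupported column type %q", dbType)
	}
	return fc, nil
}
```

With a table like this, adding support for another driver's type name becomes a one-line change to the map rather than a new branch in the translation function.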
Apologies, this PR is a bit complicated. Most commits here support run
		return fmt.Errorf("flush to %s, error:%v", w.table, e)
	}
	w.buf = w.buf[:0]
	w.flushID++
Just curious: if we write multiple versions of the model parameters to the DB (i.e., there are many flushIDs), how do we determine which model to use when running prediction?
Are there already checks that ensure we always write parameters of the same model? For example, what if a user trains a DNN model first, then changes the model type to LR but doesn't change the name of the table where we save the model parameters?
We are considering using independent storage media to save models. Let's keep on discussing in that thread.
// HIVE and ODPS don't support AUTO_INCREMENT
// Hive and ODPS don't support BLOB, use BINARY instead
var stmt string
if driver == "mysql" || driver == "sqlite3" {
Maybe we need a DriverType type to unify these checks, in order to avoid coding mistakes like "Hive" == "hive".
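One way the suggestion could look, as a rough sketch only. The constant names, `ParseDriverType`, and `SupportsAutoIncrement` are all hypothetical and not from the PR; the set of drivers is limited to the names that appear in this thread, and ODPS is left out because its driver string is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// DriverType is a hypothetical named type for the supported SQL drivers.
type DriverType string

const (
	DriverMySQL   DriverType = "mysql"
	DriverSQLite3 DriverType = "sqlite3"
	DriverHive    DriverType = "hive"
)

// ParseDriverType normalizes a user-supplied driver name once, so later
// comparisons use constants instead of raw strings such as "Hive".
func ParseDriverType(name string) (DriverType, error) {
	switch d := DriverType(strings.ToLower(strings.TrimSpace(name))); d {
	case DriverMySQL, DriverSQLite3, DriverHive:
		return d, nil
	default:
		return "", fmt.Errorf("unknown driver %q", name)
	}
}

// SupportsAutoIncrement mirrors the mysql/sqlite3 check in the diff above.
func (d DriverType) SupportsAutoIncrement() bool {
	return d == DriverMySQL || d == DriverSQLite3
}
```

Parsing the name at the boundary means a misspelled or differently cased driver is rejected up front, and the remaining checks compare typed constants that the compiler can verify.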
Good catch.
This will be fixed in Unify DriverType name.
LGTM. Excellent job!
fix #318
TODO: