[Data Store] My Sql - target, source and driver(storey) #2407

davesh0812 · 2022-09-19T12:54:01Z

This PR implements a source, a target, and a driver for SqlDB.
The SqlDB target can either create a new SQL collection or read from an existing one. The collection's schema is fixed and can't change during the flow.
SqlDriver is not fully supported in the aggregation query flow.

https://jira.iguazeng.com/browse/ML-2610


gtopper · 2022-10-04T05:28:57Z

mlrun/datastore/sources.py

+            engine = db.create_engine(db_path)
+            metadata = db.MetaData()
+            connection = engine.connect()
+            collection = db.Table(


Why call a table a "collection"? Where does this terminology come from? It looks like the term collection only exists in Oracle's PL/SQL, and it doesn't mean a table there either.

gtopper · 2022-10-04T05:32:03Z

mlrun/datastore/storeySourse.py

+class SqlDBSourceStorey(storey.sources._IterableSource, storey.sources.WithUUID):
+    """Use mongodb collection as input source for a flow.


Looks like a copy and paste accident.

gtopper · 2022-10-04T05:32:26Z

mlrun/datastore/storeySourse.py

+from storey.dtypes import _termination_obj
+
+
+class SqlDBSourceStorey(storey.sources._IterableSource, storey.sources.WithUUID):


Please open a storey PR for this.

Also, storeySourse.py is not a correct file name; it should be storey_source.py. But the correct resolution is to move this to storey.

gtopper · 2022-10-04T05:36:02Z

mlrun/datastore/storeySourse.py

+            self.collection_name, metadata, autoload=True, autoload_with=engine
+        )
+        results = connection.execute(db.select([collection])).fetchall()
+        df = pandas.DataFrame(results)


The convention is to import pandas as pd.

@gtopper I know, but source.py in storey imports it as import pandas, so I did the same, because one day this code will be merged into it.

gtopper · 2022-10-04T05:38:36Z

mlrun/datastore/targets.py

+            name=self.name or "SqlTarget",
+            after=after,
+            graph_shape="cylinder",
+            class_name="storey.NoSqlTarget",


Seems like the terminology is off when SqlDBTarget translates to NoSqlTarget.

@gtopper NoSqlTarget uses the driver (table) to write and read the data.

gtopper · 2022-10-04T05:56:30Z

tests/system/feature_store/test_sql_db.py

+def _are_mongodb_connection_string_not_set() -> bool:
+    return False


So the test never runs? Also the double negation here is confusing.

I changed it to be more understandable, and it always runs.
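For illustration, one way to avoid the double negation is a positively named helper that reports whether a connection string is configured, with the skip condition built on top of it. This is only a sketch under assumptions: the environment variable name `SQL_DB_PATH_STRING`, the helper name, and the test class are hypothetical and not taken from the PR, and `unittest` is used here only to keep the example dependency-free.

```python
import os
import unittest

# Hypothetical environment variable name; the PR does not say which one the test reads.
SQL_DB_PATH_ENV = "SQL_DB_PATH_STRING"


def is_sql_db_path_set() -> bool:
    """Return True when a non-empty connection string is configured."""
    return bool(os.environ.get(SQL_DB_PATH_ENV))


class TestSqlDb(unittest.TestCase):
    # The condition is evaluated once, when the class body is executed.
    @unittest.skipUnless(is_sql_db_path_set(), f"{SQL_DB_PATH_ENV} is not set")
    def test_connection_string_present(self):
        self.assertTrue(os.environ[SQL_DB_PATH_ENV])
```

With this shape the skip reason reads naturally ("skip unless the path is set"), and the test runs whenever the variable is present instead of depending on a hard-coded return value.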

gtopper · 2022-10-04T06:07:51Z

mlrun/datastore/targets.py

+                db.Table(collection_name, metadata, *columns)
+                metadata.create_all(engine)


I don't think it's right to create a SQL table inside the constructor. Table creation should happen just before writing. The way it is now, simply creating a SqlDBTarget will try to create the SQL table, and that shouldn't happen until the user runs an operation (like ingest).
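The deferred-creation pattern described above could look roughly like the following. This is a minimal sketch, not the PR's code: the class `LazySqlTarget` and its methods are hypothetical, and it uses the standard library's `sqlite3` instead of SQLAlchemy so that it stays self-contained.

```python
import sqlite3


class LazySqlTarget:
    """Hypothetical target: the constructor only records configuration."""

    def __init__(self, db_path: str, table_name: str, schema: dict):
        self.db_path = db_path
        self.table_name = table_name
        self.schema = schema  # column name -> SQL type
        self._connection = None  # nothing is opened or created here

    def _ensure_table(self) -> sqlite3.Connection:
        # Connect and create the table on first use only.
        if self._connection is None:
            self._connection = sqlite3.connect(self.db_path)
            columns = ", ".join(f'"{n}" {t}' for n, t in self.schema.items())
            self._connection.execute(
                f'CREATE TABLE IF NOT EXISTS "{self.table_name}" ({columns})'
            )
        return self._connection

    def write(self, rows: list) -> None:
        # Table creation happens here, just before the first write.
        connection = self._ensure_table()
        placeholders = ", ".join("?" for _ in self.schema)
        connection.executemany(
            f'INSERT INTO "{self.table_name}" VALUES ({placeholders})', rows
        )
        connection.commit()
```

Constructing `LazySqlTarget(...)` has no side effects on the database; the table appears only when `write` is first called, which is the behavior one would expect from an operation like ingest.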

gtopper · 2022-10-04T06:08:11Z

mlrun/datastore/targets.py

+            header=True,
+            table=table,
+            index_cols=key_columns,
+            # storage_options=self._get_store().get_storage_options(),


Dead code ☠️

gtopper · 2022-10-04T06:09:06Z

mlrun/datastore/targets.py

+    ):
+        import sqlalchemy as db
+
+        # {‘fail’, ‘replace’, ‘append’} #


Dead code ☠️

gtopper · 2022-10-04T06:12:02Z

mlrun/datastore/targets.py

@@ -36,6 +36,7 @@


 class TargetTypes:
+    mongodb = "mongodb"


What does mongo have to do with this PR though?

davesh0812 · 2023-01-03T10:20:04Z

#2869

davesh0812 and others added 30 commits March 10, 2022 10:15

fix bug for loging extra data in all kinds of artifacts

7f7899c

fix bug for https://jira.iguazeng.com/browse/ML-1936

484af50

Merge branch 'mlrun:development' into development

ddb83bd

Merge branch 'mlrun:development' into development

2c3ebf9

black+isort

97890dc

Merge branch 'mlrun:development' into development

ff6fe5e

Merge branch 'mlrun:development' into development

ea7438f

support filter for story engine

a7f10ee

support filter for spark engine

1287266

Revert "support filter for spark engine"

c21b21f

This reverts commit 1287266.

Revert "support filter for story engine"

184cbbc

This reverts commit a7f10ee.

method to local function

7ca059d

Merge branch 'mlrun:development' into development

4cd0189

mongodb (pandas engine only)

08574ff

Revert "method to local function"

8c0625c

This reverts commit 7ca059d.

undo del _id

1dfbf60

_id to string

62740c2

_id to string + DB

5366114

one line func

4043de4

Merge remote-tracking branch 'origin/development' into development

5be56f7

to static method

4846ec8

to_step template

6173ec6

mongodb

b85103b

to_step works without time filter

f3ad4bc

comments

3c40df5

lint

b92f853

mongodb test

ddbd4a4

pymongo to requirements.txt

2a6c80d

imports

78ffea9

imports

53b1bc2

davesh0812 and others added 13 commits August 25, 2022 11:38

Merge remote-tracking branch 'origin/mongodb-target2' into mongodb-ta…

b912beb

…rget2

lint

b679520

requirements.txt

d7062ce

Merge branch 'mlrun:development' into development

3884740

Merge branch 'mlrun:development' into development

2131276

Merge branch 'mlrun:development' into development

b1dbedd

Merge branch 'mlrun:development' into development

5838380

Merge branch 'mlrun:development' into development

5bf8bc6

Merge branch 'development' into mongodb-target2

f5b8da7

Merge branch 'mlrun:development' into development

188fcdd

review

25a3f87

only sql db

5c44a62

lint

8bd2433

gtopper requested changes Oct 4, 2022


davesh0812 and others added 14 commits October 11, 2022 14:35

review

8239217

review

3465111

Merge branch 'mlrun:development' into development

00ac3e5

test + create table not in the init

084bdd4

Merge branch 'development' into sql-target-and-driver

4e6f532

del storey from mlrun

0cfec0c

test + update option

c6c79f6

lint

126f385

if_exists

f23603a

del unnecessary function

923b525

SqlDB -> Sql

8bd0c04

SqlDB -> SQL

61006b5

lint

6d4dab5

SqlDB -> SQL

69eadfc

davesh0812 closed this Jan 3, 2023



