New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Export SqlDump #205 #1552

Merged
merged 8 commits into from May 16, 2018

Conversation

@tcbuzor
Copy link
Collaborator

tcbuzor commented Mar 25, 2018

Code changes for Feature Request #205. New UI screen for exporting SQL Insert and Create statements in Preview and Download mode.
screen shot 2018-03-24 at 10 49 31 pm
screen shot 2018-03-24 at 10 49 45 pm

tcbuzor and others added some commits Mar 11, 2018

TonyO TonyO
@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 25, 2018

@tcbuzor What's it look like with 10 columns ?

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Mar 25, 2018

Screen with 11 columns attached.
screen shot 2018-03-24 at 10 57 47 pm

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 25, 2018

@tcbuzor 1 more option on the bottom please... "remove whitespace from column names", the Column 1 should be "Column 1" VARCHAR(1000), etc. so I think having that quick option instead of forcing users to perform a rename on each column (painful) would be a "good thing".

CREATE TABLE clipboard (
record VARCHAR(1000),
Column 1 VARCHAR(1000),
values 2 VARCHAR(1000),
values 3 VARCHAR(1000),
values 4 VARCHAR(1000),
values 5 VARCHAR(1000),
values 6 VARCHAR(1000),
values 7 VARCHAR(1000),
values 8 VARCHAR(1000)
);
INSERT INTO clipboard (record,Column 1,values 2,values 3,values 4,values 5,values 6,values 7,values 8) VALUES 
( 'Aldeasoña 2010' ),
( 'Spain' ),
( 'Castilla-y-León' ),
( 'Vino de la Tierra	' ),
( 'Bodega Convento San Francisco, S.L. (Producer)' ),
( '+39 983 87 80 52' ),
( 'www.bodegaconvento.com' ),
( 'record','record','Alfredo Santamaria 2013','Spain','Castilla-y-León','Cigales	','Alfredo Santamaria Arias, S.L. (Producer)','+34 983 58 50 06','www.bodega-santamaria.com' ),
( 'Alfredo Santamaria 2013','Alfredo Santamaria 2013','Spain','Castilla-y-León','Cigales	','Alfredo Santamaria Arias, S.L. (Producer)','+34 983 58 50 06','www.bodega-santamaria.com' ),
( 'Spain','Alfredo Santamaria 2013','Spain','Castilla-y-León','Cigales	','Alfredo Santamaria Arias, S.L. (Producer)','+34 983 58 50 06','www.bodega-santamaria.com' )

Also, another nice feature to have... if I facet on only 24 rows of my 100 rows... I expect that the Export to SQL will only preview those 24 rows that I have a Text facet on. When I tried this, it was previewing all rows, no matter my Facet options...otherwise the preview ends up looking like above, where there is a schema alignment issue because it is previewing the first 10 rows of my 100 rows, where some are not in the right alignment and I knew that and why I applied a facet on only the ones rows I wanted to export to SQL and preview.

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Mar 26, 2018

@thadguidry Check the latest code. Added the trim and facet options.
screen shot 2018-03-25 at 9 02 29 pm

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 26, 2018

@tcbuzor umm... I wasn't saying to ignore the facets, but to respect them ?

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Mar 26, 2018

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 26, 2018

@tcbuzor ok, the Facet respecting works now. But not the trim on column names for preview or download..
capture

CREATE TABLE clipboard (
record VARCHAR(255),
Column 1 VARCHAR(255),
values 2 VARCHAR(255),
values 3 VARCHAR(255),
values 4 VARCHAR(255),
values 5 VARCHAR(255),
values 6 VARCHAR(255),
values 7 VARCHAR(255),
values 8 VARCHAR(255)
);
INSERT INTO clipboard (record,Column 1,values 2,values 3,values 4,values 5,values 6,values 7,values 8) VALUES 
( 'record','record','Alfredo Santamaria 2013','Spain','Castilla-y-León','Cigales	','Alfredo Santamaria Arias, S.L. (Producer)','+34 983 58 50 06','www.bodega-santamaria.com' ),
( 'record','record','Al-Ria Reserva 2015','Portugal','Algarve','Vinho Regional	','Casa Santos Lima (Producer)','+351 2 63 76 90 93;+351 2 63 76 06 21','www.casasantoslima.com' ),
( 'record','record','Amaren Tempranillo Reserva 2009','Spain','Rioja','Rioja Alavesa	','Araex Rioja Alavesa SL (Producer)','+34 945 15 05 88','www.araex.com' ),
( 'record','record','Aponte Reserva 2008','Spain','Castilla-y-León','Toro	','Bodegas Frontaura SLU (Producer)','+34 983 88 04 88','www.bodegasfrontaura.com' ),
( 'record','record','Arzuaga Reserva Especial 2011','Spain','Castilla-y-León','Ribera del Duero	','Bodegas Arzuaga Navarro SL (Producer)','+34 983 68 11 46','www.arzuaganavarro.com' ),
( 'record','record','Baigorri Belus 2013','Spain','Rioja','Rioja Alavesa	','Bodegas Baigorri (Producer)','+34 945 60 94 20','www.bodegasbaigorri.com' ),
( 'record','record','Bottega Il Vino degli Dei 2012','Italy','Veneto','Amarone della Valpolicella DOCG	','Bottega SPA (Producer)','+39 04 38 40 67','www.bottegaspa.com' ),
( 'record','record','Burro Loco Rosado 2016','Spain','Castilla-y-León','Vino de la Tierra	','Concejo Bodegas SL (Producer)','+34 983 50 22 63','www.concejobodegas.com' ),
( 'record','record','Cancellaia 2012','Italy','Toscana','Toscana IGT	','Az. Agr. Pakravan-Papi (Producer)','+39 3356 00 44 46','www.pakravan-papi.it' ),
( 'record','record','Cardela 2014','Spain','Castilla-y-León','Ribera del Duero	','Bodegas Bohórquez, S.L. (Producer)','+34 915 64 37 51','www.bodegasbohorquez.com' ),
( 'record','record','Casa Ferreirinha Vinha Grande Red 2014','Portugal','Porto E Douro','Douro	','Sogrape Vinhos SA (Producer)','+351 227 86 80 08;+351 227 85 03 00','www.sograpevinhos.com' ),
( 'record','record','Casa Mayor Old Vines Cabernet Sauvignon 2016','Chile','Valle de Colchagua','Viña Casa Silva SA (Producer)','+56 7 22 71 65 19','www.casasilva.cl' ),
( 'record','record','Castello di Vicarello 2012','Italy','Toscana','Toscana IGT	','Castello di Vicarello (Producer)','+39 0564 99 04 47','www.castellodivicarellovini.com' ),
( 'record','record','Castillo Labastida Ermita Santa Lucia 2016','Spain','Rioja','Rioja Alavesa	','Araex Rioja Alavesa SL (Producer)','+34 945 15 05 88','www.araex.com' ),
( 'record','record','Castroviejo 2013','Spain','Rioja','Rioja DOCa	','Bodegas Pastor Diaz Sl (Producer)','+34 941 14 23 90','www.pastordiaz.com' ),
( 'record','record','Cava Amorany Brut Gran Cuvée','Spain','Cataluña','Cava	','Vid Vica SL (Producer)','+34 933 10 22 62','www.vidvica.com' ),
( 'record','record','Champagne Bauchet Millésime Premier Cru 2009','France','Champagne','Champagne	','Champagne Bauchet (Producer)','+33 3 26 28 17 72; +33 3 26 58 17 72','www.champagne-bauchet.com' ),
( 'record','record','Champagne Fresne Ducret La Grande Hermine 2009','France','Champagne','Champagne	','Champagne Fresne Ducret/SCEV JA MILAUR (Producer)','+33 3 26 49 24 60','www.champagne-fresne-ducret.com' ),
( 'record','record','Champagne Jean Dumangin Brut Millésime 2009','France','Champagne','Champagne	','EARL Jean Dumangin (Producer)','+33 3 26 03 42 17','www.champagne-jean-dumangin.fr' ),
( 'record','record','Champagne Marc Perla-Rosa 2012','France','Champagne','Champagne	','Champagne Marc (Producer)','+33 3 26 58 46 88','www.champagne-marc.fr' ),
( 'record','record','Champagne Vincent Lamoureux Rosé','France','Champagne','Champagne	','EARL Lamoureux Vincent (Producer)','+33 3 25 29 39 32','www.champagne-lamoureux-vincent.fr' ),
( 'record','record','Château La Fleur Peyrabon 2014','France','Bordeaux','Pauillac	','Château Peyrabon (Producer)','+33 5 56 59 57 10','www.chateau-peyrabon.com' ),
( 'record','record','Château Moulin du Terrier 2016','France','Bordeaux','Bordeaux Rouge	','GAEC Forcato (Producer)','+33 6 71 21 57 71','www.chateau-lary.com' ),
( 'record','record','Château Ventenac Carla Rosé 2016','France','Languedoc-Roussillon','Cabardès	','Maison Ventenac (Producer)','+33 4 68 24 93 42','www.maisonventenac.fr' )
@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Mar 26, 2018

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Mar 26, 2018

@OpenRefine OpenRefine deleted a comment from codacy-bot Mar 26, 2018

@@ -134,5 +137,13 @@ public String getCreateSQL() {
}
return createSQL;
}

// public static void main(String[] args) {

This comment has been minimized.

@thadguidry

thadguidry Mar 26, 2018

Member

Remove unused commented code please.

@thadguidry thadguidry requested review from ettorerizza and jackyq2015 Mar 26, 2018

@thadguidry thadguidry added this to the 3.0 milestone Mar 26, 2018

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 27, 2018

@tcbuzor I think this is feature complete now from my side.
@jackyq2015 @ettorerizza could you test locally also and give it a review also ?

public void endFile() {
try {
if (columnNames.isEmpty()) {
writer.write(NO_COL_SELECTED_ERROR);

This comment has been minimized.

@jackyq2015

jackyq2015 Mar 30, 2018

Contributor

Should this be controlled from the front end? We can have the validation at back end, but we may not need write the error into the dump file.

This comment has been minimized.

@tcbuzor

tcbuzor Mar 31, 2018

Author Collaborator

There is front end validation as well if you try to generate sql with no column selected. I can suppress the error dump but if this happens, the user will see an empty file.

sql.append("DROP TABLE " + table + ";\n");
}

sql.append("CREATE TABLE ").append(table);

This comment has been minimized.

@jackyq2015

jackyq2015 Mar 30, 2018

Contributor

Not sure is there any database type required here? Oracle or Mysql?

This comment has been minimized.

@tcbuzor

tcbuzor Mar 31, 2018

Author Collaborator

This should work for MySQL, Oracle, PgSql and MariaDB

@ettorerizza

This comment has been minimized.

Copy link
Member

ettorerizza commented Mar 30, 2018

Some observations:

  • DROP TABLE triggers an error if the table does not exist. Ideally, DROP TABLE IF EXISTS should be used, but I'm not sure it's standard SQL.

  • The SQL script exported produces an error in the database ("all VALUES must have the same number of terms") if the OR project contains null (try with csv attached). Possible solution: transform all null into empty strings just before export.

  • The system does not handle records (one to many relationship). In my example, it would ideally be necessary for the column "instanceof", which is multivalued, to become a separate table linked to the main one by a foreign key. If it's too complicated, then the ideal would be to perform a "fill down ALL" just before the export, in order to get a correct denormalized table. But this function does not exist yet in Open Refine (the fill/blank down is also imperfect and to be used with caution).

wikidata.zip

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Mar 30, 2018

@ettorerizza I would rather not force opinion but think we should instead give the user an option to "Possible solution: transform all null into empty strings just before export."

The "handle records as relationships" for SQL export would be an enhancement, and THAT would have a multitude of options since users differ on their needs. I speak from experience with this.

@ettorerizza So... Go ahead and open a new enhancement issue for your design thoughts on "handle records as relationships" and provide drawings even if you can about how you ideally would see an interface to deal with that and we can have @tcbuzor Tony work on that in a future enhancement...and ideally once we get our front / backend separated.

@ettorerizza

This comment has been minimized.

Copy link
Member

ettorerizza commented Mar 30, 2018

I would rather not force opinion but think we should instead give the user an option

@thadguidry I agree that the user should have the choice of how he treats nulls. But for the moment, he has absolutely no choice. If a cell is null, the SQL script will not work.

In other words, if you have this:

screenshot-127 0 0 1-3333-2018 03 30-18-41-41

The export will produce this, which is incorrect.

CREATE TABLE clipboard (
Col1 VARCHAR(255),
Col2 VARCHAR(255),
Col3 VARCHAR(255)
);
INSERT INTO clipboard (Col1,Col2,Col3) VALUES 
( 'A','B','B' ),
( 'C','D' ), <------------- incorrect
( 'D','E','E' )

Go ahead and open a new enhancement issue for your design thoughts

I will try to draw that. The opinion of @gobertm could be interesting on this question.

@wetneb

This comment has been minimized.

Copy link
Member

wetneb commented Apr 4, 2018

@tcbuzor the bug reported by @ettorerizza above looks quite important (congrats to him for finding it!), I think it should be addressed before merging. Support for records would be nice but is beyond the scope of this PR I think.

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Apr 7, 2018

@wetneb , @ettorerizza I am working on the reported bug. I will add an option to "convert null cell to empty string" that way a valid sql insert statement will be generated. This will work for VARCHAR, TEXT and CHAR types only. Need to decide what to do when the user selects a numeric type. Return an error maybe?

@wetneb

This comment has been minimized.

Copy link
Member

wetneb commented Apr 8, 2018

@tcbuzor or you could just change the table schema so that null values are accepted in all columns, maybe? SQL has support for NULL, so why not use it?

@gobertm

This comment has been minimized.

Copy link
Contributor

gobertm commented Apr 11, 2018

@ettorerizza Sorry for late reply . Didn't see the notification.

I think the best solution is to insert null.

INSERT INTO clipboard (Col1,Col2,Col3) VALUES 
( 'A','B','B' ),
( 'C',null,'D' ),
( 'D','E','E' )

This will be correct in regards of the input. Replacing by an empty string is not the same. An empty string is not null.
If I perform a Select * from clipboard where Col2 is null I wont' get the line if it's replaced with an empty string.

Moreover it may be useful to add a "Not null" option in the export columns options. Because by default a CREATE TABLE t ( Score INT(2) ); allows null values. It will therefore be CREATE TABLE t ( Score INT(2) NOT NULL); or even allowing the user to enter a default value in case of undesired null values in the input CREATE TABLE t ( Score INT(2) NOT NULL DEFAULT 0 );

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Apr 18, 2018

@tcbuzor Just insert a null, as discussed above. We all upvoted.

TonyO added some commits Apr 22, 2018

@tcbuzor

This comment has been minimized.

Copy link
Collaborator Author

tcbuzor commented Apr 27, 2018

Just checked in the latest export code. I added 2 columns(Allow Null , Default Value). If the user indicates the column is not Nullable then he has the option of adding a default value, which will be used in the insert statement when the field value is NULL.
screen shot 2018-04-26 at 8 57 57 pm

@thadguidry

This comment has been minimized.

Copy link
Member

thadguidry commented Apr 27, 2018

@jackyq2015 I'll need your eyes on this as final verify, please. Take a look. Thanks.

@codacy-bot

This comment has been minimized.

Copy link

codacy-bot commented Apr 27, 2018

Codacy Here is an overview of what got changed by this pull request:

Issues
======
- Added 13
           

Complexity increasing per file
==============================
- main/src/com/google/refine/exporters/CustomizableTabularExporterUtilities.java  12
- main/src/com/google/refine/commands/project/ExportRowsCommand.java  4
         

Clones added
============
- main/webapp/modules/core/scripts/project/exporters.js  1
         

See the complete overview on Codacy

@OpenRefine OpenRefine deleted a comment from codacy-bot Apr 27, 2018

@jackyq2015 jackyq2015 merged commit 0eb7b40 into OpenRefine:master May 16, 2018

0 of 2 checks passed

Codacy/PR Quality Review Not up to standards. This pull request quality could be better.
Details
continuous-integration/appveyor/pr AppVeyor was unable to build non-mergeable pull request
Details

@thadguidry thadguidry added this to TODO in Google Sponsored Projects via automation May 26, 2018

@thadguidry thadguidry moved this from TODO to Done in Google Sponsored Projects May 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment