Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gitlw/migrate tool #3295

Merged
merged 64 commits into from May 10, 2019

Conversation

@gitlw
Copy link
Contributor

commented Apr 16, 2019

This change is Reviewable

gitlw added 30 commits Apr 5, 2019
Copy link
Member

left a comment

Reviewed 1 of 8 files at r1, 1 of 7 files at r4, 1 of 6 files at r5, 2 of 6 files at r6, 3 of 3 files at r7.
Reviewable status: all files reviewed, 54 unresolved discussions (waiting on @gitlw, @golangcibot, @mangalaman93, and @manishrjain)


dgraph/cmd/migrate/dump_meta.go, line 47 at r7 (raw file):

	columnNames    []string
	blankNodeLabel string
	//columnTypes    []DataType

why is this commented out?


dgraph/cmd/migrate/dump_meta.go, line 56 at r7 (raw file):

	for table, guide := range m.tableGuides {
		tableInfo := m.tableInfos[table]
		for _, index := range guide.indexG.genDgraphIndices(tableInfo) {

indexGen seems a more readable name to me.


dgraph/cmd/migrate/dump_meta.go, line 66 at r7 (raw file):

runs a topological sort 

where is this sorting happening? I don't see any sorting happening inside this function.


dgraph/cmd/migrate/dump_meta.go, line 101 at r7 (raw file):

	defer rows.Close()

	var rmi *RowMetaInfo

seems to me like the Info is unnecessary in this name. I am assuming Meta stands for metadata in which case the word already contains the idea this is some kind of information.

RowMeta should be enough.

nit: Also rmi could also be named meta. Just by looking at a line, I wouldn't have a clue what rmi is.

This also applies to other places where RowMetaInfo and rmi appear.


dgraph/cmd/migrate/dump_meta.go, line 118 at r7 (raw file):

		// step 2: output the column values in RDF format
		rmi.blankNodeLabel = tableGuide.blankNodeG.genBlankNode(tableInfo, colValues)

blankNodeGen is more descriptive.


dgraph/cmd/migrate/dump_meta.go, line 130 at r7 (raw file):

}

// dumpTable reads data from a table and sends generated RDF entries to the m.dataWriter

this comment needs to be updated.

Also, the logic in dumpTableConstraints and dumpTable seems very similar. Could both the table and the constraints be dumped in a single function and iterator pass?


dgraph/cmd/migrate/dump_meta.go, line 175 at r7 (raw file):

// this function will return the following tuple
// ([fname, lname], [VARCHAR, VARCHAR], [person_fname, person_lname])
func getRowMetaInfo(rows *sql.Rows, tableGuide *TableGuide,

getRowMeta should be enough here as well.


dgraph/cmd/migrate/dump_meta.go, line 240 at r7 (raw file):

			//logger.Printf("ignoring the constraint because of error when getting ref label: %+v\n",
			//err)
			return

why is this commented out?


dgraph/cmd/migrate/dump_meta.go, line 255 at r7 (raw file):

func outputPlainCell(blankNode string, dataType DataType, predName string,
	colValue interface{}, writer *bufio.Writer) {
	// each cell value should be stored under a predicate

nit: start comments with uppercase and end them with periods if they are a complete sentence.


dgraph/cmd/migrate/dump_meta.go, line 269 at r7 (raw file):

			//logger.Printf("ignoring object %v because of error when getting value: %v", colValue,
			//err)

why is this commented out?


dgraph/cmd/migrate/dump_meta.go, line 284 at r7 (raw file):

// Consider the foreign key constraint
// foreign key (person_company, person_employee_id) references person (company, employee_id)

blank key


dgraph/cmd/migrate/run.go, line 88 at r4 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Done.

getTableInfo has pool as it's last argument.


dgraph/cmd/migrate/run.go, line 55 at r7 (raw file):

	flag.StringP("output_schema", "s", "", "The schema output file")
	flag.StringP("output_data", "o", "", "The data output file")

I think it's safe to give these two some default value.

You could also check if the passed values correspond to existing files before running the migration and exit early if that's the case if you aren't doing so now.


dgraph/cmd/migrate/run.go, line 67 at r7 (raw file):

	if len(mysqlUser) == 0 {
		logger.Fatalf("the mysql_user property should not be empty")

The log messages should be complete sentences (start with uppercase, end with period).

Same for all the other Fatalf calls.


dgraph/cmd/migrate/table_guide.go, line 66 at r3 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Done.

CounterBNGen would still be short and a bit more descriptive


dgraph/cmd/migrate/table_guide.go, line 32 at r7 (raw file):

// A BlankNodeG generates the unique blank node label that corresponds to a Dgraph uid.
// Values are passed to the genBlankNode method in the order of alphabetically sorted columns
type BlankNodeG interface {

BlankNodeGen is a more descriptive name. or BNGen to follow the other generators' naming convention.


dgraph/cmd/migrate/table_guide.go, line 37 at r7 (raw file):

// ColumnBNG generates blank node labels using values in the primary key columns
type ColumnBNG struct {

ColumnBNGen is a bit more descriptive.


dgraph/cmd/migrate/table_guide.go, line 217 at r7 (raw file):

// an IdxG is responsible for generating Dgraph indices
type IdxG interface {

I think idxGen is a slightly more descriptive name.


dgraph/cmd/migrate/table_guide.go, line 227 at r7 (raw file):

// person_lname: string @index(exact) .
// The reason is that Dgraph does not support composite indices as of now.
type CompositeIdxG struct {

same as above: CompositeIdxGen is more obvious to me.


dgraph/cmd/migrate/table_guide.go, line 271 at r7 (raw file):

// a PredNameG is responsible for generating pred names based on a table's info and a column name
type PredNameG interface {

PredNameGen seems like a clearer name to me.

I think just PredGen would work as well.


dgraph/cmd/migrate/table_guide.go, line 275 at r7 (raw file):

}

type SimplePredNameG struct {

SimplePredNameGen or SimplePredGen


dgraph/cmd/migrate/utils.go, line 75 at r7 (raw file):

// returns the indices of the columns satisfying the criteria function
func getColumnIndices(info *TableInfo,
	criteria criteriaFunc) []*ColumnIdx {

I think this fits in the previous line

@mangalaman93 mangalaman93 requested review from manishrjain and martinmr May 4, 2019
Copy link
Contributor

left a comment

Reviewed 1 of 7 files at r4.
Reviewable status: 6 of 7 files reviewed, 44 unresolved discussions (waiting on @gitlw, @golangcibot, @mangalaman93, @manishrjain, and @martinmr)


dgraph/cmd/migrate/datatype.go, line 20 at r8 (raw file):

const (
	UNKNOWN DataType = iota

Is there convention to keep the capital and exported? Could avoid that, seems unnecessary to export.


dgraph/cmd/migrate/table_guide.go, line 283 at r8 (raw file):

}

type TableGuide struct {

Would be useful to record what TableGuide does here. And not sure, why the type is exported. Check for other types too, we don't need to export any of these types from this package.


dgraph/cmd/migrate/table_info.go, line 110 at r4 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Not sure what you mean by that. Do you mean the case where the table contains no columns and hence the iterator returns no rows? In that case, the columnRows.Next() would not return true.
Please let me know if you meant something else.

Yeah, I'm not sure now! What I tried for the firs time is to run the migration tool on information_schema tables. It failed around here with some exception. You might want to check that.


dgraph/cmd/migrate/table_info.go, line 65 at r8 (raw file):

	columns   map[string]*ColumnInfo

	columnDataTypes []DataType // used by the RowMetaInfo when outputing rows

This looks pretty strange to store it here, in the order that has no meaning in the TableInfo struct. Can we store it in the RowMetaInfo struct itself? Another option could be to store it in ColumnInfo and use the column name -> type mapping when needed to figure out the type.

@mangalaman93 mangalaman93 self-requested a review May 4, 2019
@mangalaman93

This comment has been minimized.

Copy link
Contributor

commented May 4, 2019

@gitlw I am still pretty lost after the point we start building TableGuides, can't follow the code afterwards. See if it is possible to simplify the code and add more comments.

gitlw and others added 3 commits May 6, 2019
Copy link
Member

left a comment

Gave my review and made some changes. Will leave it to @mangalaman93 and @martinmr for the final approval.

Reviewable status: 2 of 7 files reviewed, 75 unresolved discussions (waiting on @gitlw, @golangcibot, @mangalaman93, @manishrjain, and @martinmr)


dgraph/cmd/migrate/datatype.go, line 20 at r8 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

Is there convention to keep the capital and exported? Could avoid that, seems unnecessary to export.

In Go, they don't need to be cap case. They can be camelcase.


dgraph/cmd/migrate/datatype.go, line 34 at r9 (raw file):

func initDataTypes() {
	typeToString = make(map[DataType]string)
	typeToString[UNKNOWN] = "unknown"

Add a comment that this is for generating schema file.


dgraph/cmd/migrate/datatype.go, line 43 at r9 (raw file):

	mysqlTypePrefixToGoType = make(map[string]DataType)
	mysqlTypePrefixToGoType["int"] = INT

Add a comment that this is for parsing SQL schema.


dgraph/cmd/migrate/datatype.go, line 47 at r9 (raw file):

	mysqlTypePrefixToGoType["text"] = STRING
	mysqlTypePrefixToGoType["date"] = DATETIME
	mysqlTypePrefixToGoType["time"] = DATETIME

sqlTypeToInternal


dgraph/cmd/migrate/run.go, line 50 at r9 (raw file):

	flag := Migrate.Cmd.Flags()
	flag.StringP("mysql_user", "", "",

Remove the mysql prefix from all the options. Tomorrow we could extend this to Postgres and others.


dgraph/cmd/migrate/run.go, line 51 at r9 (raw file):

	flag := Migrate.Cmd.Flags()
	flag.StringP("mysql_user", "", "",
		"The MySQL user for logging in")

might fit in the same line. This and below.


dgraph/cmd/migrate/run.go, line 60 at r9 (raw file):

	flag.StringP("output_data", "o", "sql.rdf", "The data output file")
	flag.BoolP("quiet", "q", false, "Enable quiet mode to suppress "+
		"the warning logs")

move to line above.


dgraph/cmd/migrate/run.go, line 75 at r9 (raw file):

		logger.Fatalf("The mysql_user property should not be empty.")
	}
	if len(mysqlDB) == 0 {

Can use switch case for all these ifs.

switch {
case len(mysqlDB) == 0:
  ...
case len(password) == 0:
  ...

dgraph/cmd/migrate/run.go, line 98 at r9 (raw file):

	initDataTypes()

	pool, err := getMySQLPool(mysqlUser, mysqlDB, mysqlPassword)

getPool. Remove all reference to MySQL.


dgraph/cmd/migrate/run.go, line 130 at r9 (raw file):

// checkFileOW checks if the program is trying to output to an existing file.
// If so, we would need to ask the user whether we should overwrite the file or abort the program.
func checkFileOW(file string) error {

Remove the OW. It's not clear what it means.


dgraph/cmd/migrate/run.go, line 136 at r9 (raw file):

		for {
			fmt.Printf("overwriting the file %s (y/N)? ", file)
			text, _ := reader.ReadString('\n')

Is this an error? If so, either x.Check or return error.


dgraph/cmd/migrate/table_guide.go, line 217 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I think idxGen is a slightly more descriptive name.

You can remove this interface.


dgraph/cmd/migrate/table_guide.go, line 227 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

same as above: CompositeIdxGen is more obvious to me.

You can remove this struct.


dgraph/cmd/migrate/table_guide.go, line 32 at r9 (raw file):

// A blankNodeGen generates the unique blank node label that corresponds to a Dgraph uid.
// Values are passed to the genBlankNode method in the order of alphabetically sorted columns
type blankNodeGen interface {

s/blankNode/generator


dgraph/cmd/migrate/table_guide.go, line 33 at r9 (raw file):

// Values are passed to the genBlankNode method in the order of alphabetically sorted columns
type blankNodeGen interface {
	genBlankNode(info *tableInfo, values []interface{}) string

blankNode

So, it would read gen.blankNode.


dgraph/cmd/migrate/table_info.go, line 63 at r9 (raw file):

// a tableInfo contains a SQL table's metadata such as the table name,
// the info of each column etc
type tableInfo struct {

type sqlTable struct


dgraph/cmd/migrate/table_info.go, line 101 at r9 (raw file):

	query := fmt.Sprintf(`select COLUMN_NAME,DATA_TYPE from INFORMATION_SCHEMA.
COLUMNS where TABLE_NAME = "%s" AND TABLE_SCHEMA="%s" ORDER BY COLUMN_NAME`, table, database)
	columnRows, err := pool.Query(query)

columns, err


dgraph/cmd/migrate/table_info.go, line 107 at r9 (raw file):

	defer columnRows.Close()

	tableInfo := &tableInfo{

table := &sqlTable{ ... }


dgraph/cmd/migrate/table_info.go, line 131 at r9 (raw file):

		*/
		var fieldName, dbType string
		err := columnRows.Scan(&fieldName, &dbType)

if err := ...; err != nil


dgraph/cmd/migrate/table_info.go, line 143 at r9 (raw file):

		tableInfo.columnDataTypes = append(tableInfo.columnDataTypes, getDataType(dbType))
	}

Sort the column names here using sort.Strings? Or confirm that SQL would have already given them in a sorted order.


dgraph/cmd/migrate/table_info.go, line 147 at r9 (raw file):

	indexQuery := fmt.Sprintf(`select INDEX_NAME,COLUMN_NAME from INFORMATION_SCHEMA.`+
		`STATISTICS where TABLE_NAME = "%s" AND index_schema="%s"`, table, database)
	indexRows, err := pool.Query(indexQuery)

indices, err


dgraph/cmd/migrate/table_info.go, line 162 at r9 (raw file):

			tableInfo.columns[columnName].keyType = PRIMARY
		default:
			tableInfo.columns[columnName].keyType = MULTI

What happens if we have a full-text search index on a column in SQL. Would it be considered multi? (It should not be)


dgraph/cmd/migrate/table_info.go, line 170 at r9 (raw file):

		REFERENCED_COLUMN_NAME from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where TABLE_NAME = "%s"
        AND CONSTRAINT_SCHEMA="%s" AND REFERENCED_TABLE_NAME IS NOT NULL`, table, database)
	foreignKeyRows, err := pool.Query(foreignKeysQuery)

foreignKeys or fkeys (fkeys should be sufficient).


dgraph/cmd/migrate/table_info.go, line 186 at r9 (raw file):

		+---------------+-----------------+-----------------------+------------------------+
		*/
		var columnName, constraintName, referencedTableName, referencedColumnName string

col, constraint, dstTable, dstCol

or refTable, refCol


dgraph/cmd/migrate/table_info.go, line 193 at r9 (raw file):

		}

		tableInfo.referencedTables[referencedTableName] = struct{}{}

table.dstTables or table.refTables


dgraph/cmd/migrate/table_info.go, line 218 at r9 (raw file):

// col4, col5, col6 whose remote table name is A, and remote columns are
// col1, col2 and col3
func validateAndGetReverse(constraint *fkConstraint) (string, *fkConstraint) {

Remove this?


dgraph/cmd/migrate/table_info.go, line 243 at r9 (raw file):

// the data at tables[table name].foreignKeyReferences
// and stores them in tables[table name].columns[column name].referencedBy
func populateReferencedByColumns(tables map[string]*tableInfo) {

Remove?


dgraph/cmd/migrate/utils.go, line 34 at r9 (raw file):

func getMySQLPool(mysqlUser string, mysqlDB string, password string) (*sql.DB,
	error) {
	pool, err := sql.Open("mysql",

return sql.Open(...)


dgraph/cmd/migrate/utils.go, line 47 at r9 (raw file):

// 2) if the parameter is empty, this function will read all the tables under the given
// database from MySQL and then return the result
func readMySqlTables(pool *sql.DB, mysqlTables string) ([]string, error) {

showTables


dgraph/cmd/migrate/utils.go, line 117 at r9 (raw file):

func getColumnValues(columns []string, dataTypes []DataType,
	rows *sql.Rows) ([]interface{}, error) {
	colValuePtrs := make([]interface{}, 0, len(columns))

valuePtrs


dgraph/cmd/migrate/utils.go, line 137 at r9 (raw file):

		return nil, fmt.Errorf("error while scanning column values: %v", err)
	}
	colValues := ptrToValues(colValuePtrs)

ptrToValues can just be inlined. No one else seems to be calling it.


dgraph/cmd/migrate/utils.go, line 149 at r9 (raw file):

	}
	sort.Slice(columns, func(i, j int) bool {
		return columns[i] < columns[j]

Would make sense to sort the columns when you store them in tableInfo, before the object is passed around. Then, you can assume that the cols are sorted.

If the above is true, remove this func.


dgraph/cmd/migrate/dump_meta.go, line 42 at r9 (raw file):

// rowMeta captures values in a SQL table row, as well as the metadata associated
// with the row
type rowMeta struct {

sqlRow.

row := sqlRow{...}


dgraph/cmd/migrate/dump_meta.go, line 43 at r9 (raw file):

// with the row
type rowMeta struct {
	colValues      []interface{}

values


dgraph/cmd/migrate/dump_meta.go, line 223 at r9 (raw file):

// outputPlainCell sends to the writer a RDF where the subject is the blankNode
// the predicate is the predName, and the object is the colValue
func outputPlainCell(blankNode string, dataType DataType, predName string,

This can be part of the dumpMeta struct. Also, try to reuse the strings.Builder. Also, decrease the num args.

Copy link
Contributor Author

left a comment

Reviewable status: 1 of 32 files reviewed, 74 unresolved discussions (waiting on @golangcibot, @mangalaman93, @manishrjain, and @martinmr)


dgraph/cmd/migrate/datatype.go, line 20 at r8 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

Is there convention to keep the capital and exported? Could avoid that, seems unnecessary to export.

You are right that there is no need to export, but using the lower case word would result in conflicts with the keywords in golang, and I don't want to come up with other weird names.


dgraph/cmd/migrate/datatype.go, line 34 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add a comment that this is for generating schema file.

Done.


dgraph/cmd/migrate/datatype.go, line 43 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add a comment that this is for parsing SQL schema.

Done.


dgraph/cmd/migrate/datatype.go, line 47 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

sqlTypeToInternal

Done.


dgraph/cmd/migrate/run.go, line 55 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
	flag.StringP("output_schema", "s", "", "The schema output file")
	flag.StringP("output_data", "o", "", "The data output file")

I think it's safe to give these two some default value.

You could also check if the passed values correspond to existing files before running the migration and exit early if that's the case if you aren't doing so now.

Done.


dgraph/cmd/migrate/run.go, line 67 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

The log messages should be complete sentences (start with uppercase, end with period).

Same for all the other Fatalf calls.

Done.


dgraph/cmd/migrate/run.go, line 50 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove the mysql prefix from all the options. Tomorrow we could extend this to Postgres and others.

Done.


dgraph/cmd/migrate/run.go, line 51 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

might fit in the same line. This and below.

Done.


dgraph/cmd/migrate/run.go, line 60 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

move to line above.

Done.


dgraph/cmd/migrate/run.go, line 75 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Can use switch case for all these ifs.

switch {
case len(mysqlDB) == 0:
  ...
case len(password) == 0:
  ...

Done.


dgraph/cmd/migrate/run.go, line 98 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

getPool. Remove all reference to MySQL.

Done.


dgraph/cmd/migrate/run.go, line 130 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove the OW. It's not clear what it means.

Done.


dgraph/cmd/migrate/run.go, line 136 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Is this an error? If so, either x.Check or return error.

Done.


dgraph/cmd/migrate/table_guide.go, line 170 at r6 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

unreachable: unreachable code (from govet)

Done.


dgraph/cmd/migrate/table_guide.go, line 32 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

BlankNodeGen is a more descriptive name. or BNGen to follow the other generators' naming convention.

Done.


dgraph/cmd/migrate/table_guide.go, line 37 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

ColumnBNGen is a bit more descriptive.

Done.


dgraph/cmd/migrate/table_guide.go, line 217 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I think idxGen is a slightly more descriptive name.

Done.


dgraph/cmd/migrate/table_guide.go, line 227 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

same as above: CompositeIdxGen is more obvious to me.

Done.


dgraph/cmd/migrate/table_guide.go, line 271 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

PredNameGen seems like a clearer name to me.

I think just PredGen would work as well.

Done.


dgraph/cmd/migrate/table_guide.go, line 275 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

SimplePredNameGen or SimplePredGen

Done.


dgraph/cmd/migrate/table_guide.go, line 283 at r8 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

Would be useful to record what TableGuide does here. And not sure, why the type is exported. Check for other types too, we don't need to export any of these types from this package.

Done.


dgraph/cmd/migrate/table_guide.go, line 32 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

s/blankNode/generator

Discussed about this offline and the comment no longer applies.


dgraph/cmd/migrate/table_guide.go, line 33 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

blankNode

So, it would read gen.blankNode.

Discussed about this offline and the comment no longer applies.


dgraph/cmd/migrate/table_info.go, line 110 at r4 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

Yeah, I'm not sure now! What I tried for the firs time is to run the migration tool on information_schema tables. It failed around here with some exception. You might want to check that.

I've tried with some sample data set and there is no exceptions now. Let me know if you are still seeing the problem.


dgraph/cmd/migrate/table_info.go, line 76 at r5 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field fieldName is unused (from unused)

Done.


dgraph/cmd/migrate/table_info.go, line 77 at r5 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field dataType is unused (from unused)

Done.


dgraph/cmd/migrate/table_info.go, line 65 at r8 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

This looks pretty strange to store it here, in the order that has no meaning in the TableInfo struct. Can we store it in the RowMetaInfo struct itself? Another option could be to store it in ColumnInfo and use the column name -> type mapping when needed to figure out the type.

I put it here since the column data types are retrieved from the information_schema.columns instead of from the table itself.
I've updated the PR so that the columnNames and predNames are also moved from rowMeta to the tableInfo.


dgraph/cmd/migrate/table_info.go, line 63 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

type sqlTable struct

Done.


dgraph/cmd/migrate/table_info.go, line 101 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

columns, err

Done.


dgraph/cmd/migrate/table_info.go, line 107 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

table := &sqlTable{ ... }

Done.


dgraph/cmd/migrate/table_info.go, line 131 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if err := ...; err != nil

Done.


dgraph/cmd/migrate/table_info.go, line 143 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Sort the column names here using sort.Strings? Or confirm that SQL would have already given them in a sorted order.

Confirmed that SQL is already giving the column names in order through the order by clause.


dgraph/cmd/migrate/table_info.go, line 147 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

indices, err

Done.


dgraph/cmd/migrate/table_info.go, line 162 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

What happens if we have a full-text search index on a column in SQL. Would it be considered multi? (It should not be)

changed the name multi to "secondary", and all non-primary indices will have the type secondary.


dgraph/cmd/migrate/table_info.go, line 170 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

foreignKeys or fkeys (fkeys should be sufficient).

Done.


dgraph/cmd/migrate/table_info.go, line 186 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

col, constraint, dstTable, dstCol

or refTable, refCol

Done.


dgraph/cmd/migrate/table_info.go, line 193 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

table.dstTables or table.refTables

Done.


dgraph/cmd/migrate/table_info.go, line 218 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove this?

I double checked and the constraint source columns are still being used for building the mapping in the values recorder, so this should not be removed.


dgraph/cmd/migrate/table_info.go, line 243 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove?

I double checked and the constraint source columns are still being used for building the mapping in the values recorder, so this should not be removed.


dgraph/cmd/migrate/utils.go, line 75 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I think this fits in the previous line

Done.


dgraph/cmd/migrate/utils.go, line 34 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

return sql.Open(...)

Done.


dgraph/cmd/migrate/utils.go, line 47 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

showTables

Done.


dgraph/cmd/migrate/utils.go, line 117 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

valuePtrs

Done.


dgraph/cmd/migrate/utils.go, line 137 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

ptrToValues can just be inlined. No one else seems to be calling it.

Done.


dgraph/cmd/migrate/utils.go, line 149 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Would make sense to sort the columns when you store them in tableInfo, before the object is passed around. Then, you can assume that the cols are sorted.

If the above is true, remove this func.

Done.


dgraph/cmd/migrate/dump_meta.go, line 101 at r4 (raw file):

Previously, gitlw (Lucas Wang) wrote…

Not really because it uses info from the "rows" iterator, which is only available after calling rows.Next()

Done.


dgraph/cmd/migrate/dump_meta.go, line 47 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

why is this commented out?

It's commented out since the column types have been moved into the TableInfo


dgraph/cmd/migrate/dump_meta.go, line 56 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

indexGen seems a more readable name to me.

Done.


dgraph/cmd/migrate/dump_meta.go, line 66 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
runs a topological sort 

where is this sorting happening? I don't see any sorting happening inside this function.

I've removed the topo sorting logic since there can be circles between the tables. Also I've updated the doc.


dgraph/cmd/migrate/dump_meta.go, line 101 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

seems to me like the Info is unnecessary in this name. I am assuming Meta stands for metadata in which case the word already contains the idea this is some kind of information.

RowMeta should be enough.

nit: Also rmi could also be named meta. Just by looking at a line, I wouldn't have a clue what rmi is.

This also applies to other places where RowMetaInfo and rmi appear.

Done.


dgraph/cmd/migrate/dump_meta.go, line 118 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

blankNodeGen is more descriptive.

Done.


dgraph/cmd/migrate/dump_meta.go, line 130 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

this comment needs to be updated.

Also, the logic in dumpTableConstraints and dumpTable seems very similar. Could both the table and the constraints be dumped in a single function and iterator pass?

The doc has been updated.

And they cannot be done with a single pass because generating RDF for a foreign key constraint requires looking up the blank node label in the remote table, which creates a dependency between tables. Plus the foreign key references can have circles. Hence for now the simplest approach is to go through all the tables once to record the mapping from foreign key columns to blank node labels, and then go through all of them one more time.


dgraph/cmd/migrate/dump_meta.go, line 175 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

getRowMeta should be enough here as well.

Done.


dgraph/cmd/migrate/dump_meta.go, line 240 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
			//logger.Printf("ignoring the constraint because of error when getting ref label: %+v\n",
			//err)
			return

why is this commented out?

because it's too verbose. I've added them back and also a flag --quiet to to turn these off.


dgraph/cmd/migrate/dump_meta.go, line 269 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
			//logger.Printf("ignoring object %v because of error when getting value: %v", colValue,
			//err)

why is this commented out?

ditto


dgraph/cmd/migrate/dump_meta.go, line 284 at r7 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

blank key

? can you please explain?


dgraph/cmd/migrate/dump_meta.go, line 42 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

sqlRow.

row := sqlRow{...}

Done.


dgraph/cmd/migrate/dump_meta.go, line 43 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

values

Done.


dgraph/cmd/migrate/dump_meta.go, line 223 at r9 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This can be part of the dumpMeta struct. Also, try to reuse the strings.Builder. Also, decrease the num args.

Done.

Copy link
Contributor Author

left a comment

Reviewable status: 1 of 32 files reviewed, 74 unresolved discussions (waiting on @golangcibot, @mangalaman93, @manishrjain, and @martinmr)


dgraph/cmd/migrate/datatype.go, line 20 at r8 (raw file):

Previously, gitlw (Lucas Wang) wrote…

You are right that there is no need to export, but using the lower case word would result in conflicts with the keywords in golang, and I don't want to come up with other weird names.

Done.

Copy link
Member

left a comment

:lgtm: Some extra comments regarding formatting and naming but otherwise the PR looks good.

Reviewed 1 of 6 files at r6, 31 of 31 files at r11.
Reviewable status: all files reviewed, 60 unresolved discussions (waiting on @gitlw, @golangcibot, @mangalaman93, and @manishrjain)


dgraph/cmd/migrate/run.go, line 65 at r11 (raw file):

Quoted 4 lines of code…
	mysqlUser := conf.GetString("user")
	mysqlDB := conf.GetString("db")
	mysqlPassword := conf.GetString("password")
	mysqlTables := conf.GetString("tables")

remove the mysql prefix from these variables as well.


dgraph/cmd/migrate/table_guide.go, line 143 at r11 (raw file):

			//logger.Printf("ignoring the constraint because of error when getting ref label: %+v",
			//cst)

this is still commented out. Should it be handled by the quite mode?

Also, add a new line if one is not included already by Printf


dgraph/cmd/migrate/utils.go, line 31 at r11 (raw file):

)

func getPool(mysqlUser string, mysqlDB string, password string) (*sql.DB,

also remove references to mysql in the argument names


dgraph/cmd/migrate/dump_meta.go, line 47 at r7 (raw file):

Previously, gitlw (Lucas Wang) wrote…

It's commented out since the column types have been moved into the TableInfo

maybe add a comment explaining why. Otherwise it looks like code that was left over.


dgraph/cmd/migrate/dump_meta.go, line 284 at r7 (raw file):

Previously, gitlw (Lucas Wang) wrote…

? can you please explain?

Sorry, I meant to say "blank line"

gitlw added 2 commits May 9, 2019
@gitlw gitlw merged commit 390d575 into master May 10, 2019
3 of 4 checks passed
3 of 4 checks passed
code-review/reviewable 5 files, 60 discussions left (gitlw, golangcibot, mangalaman93, manishrjain, martinmr)
Details
Blockade (dgraph) TeamCity build finished
Details
CI (dgraph) TeamCity build finished
Details
license/cla Contributor License Agreement is signed.
Details
@gitlw gitlw deleted the gitlw/migrate_tool branch May 10, 2019
dna2github added a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.