Issue when importing vertices and edges using ETL #6574

Closed
1 task
deeptip opened this issue Aug 17, 2016 · 2 comments

@deeptip

deeptip commented Aug 17, 2016

OrientDB Version, operating system, or hardware.

  • v2.2.5

Operating System

  • Linux

Expected behavior and actual behavior

Importing vertices and edges with ETL should complete without errors. Instead, the import fails with a timeout exception after some vertices and edges have been loaded.

Detailed Error:

- Year.csv: [image]

- Month.csv: [image]

- ETL JSON for Year.csv:
{
    "source": {
        "file": {
            "path": "../testdb/data/year.csv"
        }
    },
    "extractor": {
        "row": {}
    },
    "transformers": [{
        "csv": {
            "separator": ",",
            "skipFrom": 1,
            "skipTo": 0,
            "nullValue": "Null",
            "columnsOnFirstLine": true
        }
    },
    {
        "vertex": {
            "class": "Year"
        }
    }],
    "loader": {
        "orientdb": {
            "dbURL": "remote:1.2.3.4/testdb",
            "dbUser": "orientdb",
            "dbPassword": "orientdb",
            "serverUser": "orientdb",
            "serverPassword": "orientdb",
            "dbType": "graph",
            "classes": [{
                "name": "Year",
                "extends": "V"
            }],
            "indexes": [
                {"class":"Year", "fields":["year:string"], "type":"UNIQUE" }
            ]
        }
    }
}

- ETL JSON for Month:

{
    "config": {
        "log": "debug",
        "parallel": true
    },
    "source": {
        "file": {
            "path": "../testdb/data/month.csv"
        }
    },
    "extractor": {
        "row": {}
    },
    "transformers": [{
        "csv": {
            "separator": ",",
            "skipFrom": 1,
            "skipTo": 0,
            "nullValue": "Null",
            "columnsOnFirstLine": true
        }
    },
    {
        "vertex": {
            "class": "Month"
        }
    },
    {
        "edge": {
            "class": "BelongsTo",
            "joinFieldName": "year",
            "lookup": "Year.year",
            "direction": "out",
            "unresolvedLinkAction": "WARNING"
        }
    }],
    "loader": {
        "orientdb": {
            "dbURL": "remote:1.2.3.4/testdb",
            "dbUser": "orientdb",
            "dbPassword": "orientdb",
            "serverUser": "orientdb",
            "serverPassword": "orientdb",
            "dbType": "graph",
            "classes": [{
                "name": "Month",
                "extends": "V"
            }]
        }
    }
}

- ETL JSON for year.csv was loaded properly.

- Error when loading ETL JSON for month.csv is as follows:
C:\orientdb-community-2.2.5\bin>oetl.bat ..\testdb\etl\loadMonth.json
OrientDB etl v.2.2.5 (build 2.2.x@r393af9c5a3e4a4408440a9376283a26d2d3d3c7b; 2016-07-20 06:03:46+0000) www.orientdb.com
BEGIN ETL PROCESSOR
[file] INFO Reading from file ../testdb/data/month.csv with encoding UTF-8
Started execution with 1 worker threads
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 1000ms [0 warnings, 0 errors]
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 2s [0 warnings, 0 errors]
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 3s [0 warnings, 0 errors]
[orientdb] DEBUG - OrientDBLoader: created vertex class 'Month' extends 'V'
[orientdb] DEBUG orientdb: found 0 vertices in class 'null'
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 4s [0 warnings, 0 errors]
+ extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 5s [0 warnings, 0 errors]
Start extracting
[0:csv] DEBUG Transformer input: id_month,month,quarter,year
[0:csv] DEBUG parsing=id_month,month,quarter,year
[0:csv] DEBUG Transformer output: null
2016-08-17 12:56:22:964 WARNI Transformer [csv] returned null, skip rest of pipeline execution [OETLPipeline]
[1:csv] DEBUG Transformer input: 1,January,JFM,2010
[1:csv] DEBUG parsing=1,January,JFM,2010
[1:csv] DEBUG document={id_month:1,month:January,quarter:JFM,year:2010}
[1:csv] DEBUG Transformer output: {id_month:1,month:January,quarter:JFM,year:2010}
[1:vertex] DEBUG Transformer input: {id_month:1,month:January,quarter:JFM,year:2010}
[1:vertex] DEBUG Transformer output: v(Month)[#45:0]
[1:edge] DEBUG Transformer input: v(Month)[#45:0]

Steps to reproduce the problem

1. Import the ETL JSON for year.csv (loads without error).
2. Import the ETL JSON for month.csv (loads some vertices and edges, then throws an error after about 64000 ms).

@robfrank
Contributor

Hi, could you try loading month.csv with parallel set to false?
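
For reference, a minimal sketch of that change, assuming only the "config" block of the Month ETL JSON posted above is edited. The "source", "extractor", "transformers", and "loader" sections are omitted here and would stay exactly as in the original file:

```json
{
    "config": {
        "log": "debug",
        "parallel": false
    }
}
```

Whether this avoids the timeout was not confirmed in this thread; it is only the setting being asked about.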

@robfrank
Contributor

I'm closing, feel free to comment or report again.
