How to use Lucene with ETL (need documentation improvement) ? #8222

Closed
PhML opened this issue Apr 18, 2018 · 0 comments

PhML commented Apr 18, 2018

OrientDB Version: 2.2.33-spatial from dockerhub

OS: linux

Expected behavior

Insert spatial data in database using ETL.

Actual behavior

Error on ETL execution.

I know this isn't an issue per se, but I think it can serve as a reference for improving the documentation.

Steps to reproduce

I tried to follow @saeedtabrizi's example in this comment to insert geospatial data from the following CSV file:

OACI;LAT;LONG
LFMH;45,53;4,30
LFLL;45,73;5,08
LFLP;45,93;6,11
LFLS;45,36;5,33
LFKJ;41,92;8,80
LFLC;45,79;3,16
LFSD;47,27;5,09
LFKF;41,50;9,10
LFLB;45,64;5,88
LFMN;43,67;7,22
LFBX;45,20;0,82
LFOH;49,53;0,09
LFCR;44,41;2,48
LFCI;43,91;2,12
LFRB;48,45;-4,42
LFQQ;50,56;3,09
LFKB;42,55;9,48
LFKC;42,52;8,79
LFBV;45,15;1,47
LFEY;46,72;-2,39

Here is my ETL JSON configuration file:

{
  "config": {
    "log": "debug",
    "fileDirectory": "/tmp/data/",
    "fileName": "sample.csv"
  },
  "source": {
    "file": {
      "path": "/tmp/data/sample.csv"
    }
  },
  "extractor": {
    "csv": {
      "separator": ";",
      "nullValue": ""
    }
  },
  "transformers": [
    {
      "vertex": {
        "class": "Aeroport"
      }
    },
    {
      "field": {
        "fieldName": "location",
        "expression": "'POINT('+ $input.LONG + ' ' + $input.LAT + ')'"
      }
    },
    {
      "field": {
        "fieldName": "location",
        "expression": "St_GeomFromText(location)"
      }
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/orientdb/databases/SAMPLE",
      "serverPassword": "root",
      "dbType": "graph",
      "dbAutoDropIfExists": true,
      "dbAutoCreateProperties": true,
      "classes": [
        {
          "name": "Aeroport",
          "extends": "V"
        }
      ],
      "indexes": [
        {
          "class": "Aeroport",
          "fields": [
            "OACI:string"
          ],
          "type": "UNIQUE"
        }
      ]
    }
  }
}
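To make the intent of the first field transformer concrete, here is a minimal Python sketch of the same concatenation applied to a few rows of the CSV above. It is only an illustration of the string I expect to end up in `location` before `St_GeomFromText` is applied, not what the ETL does internally. The helper names (`to_float`, `to_wkt_point`, `wkt_points`) are made up for this example, and converting decimal commas to dots is an assumption.

```python
import csv
import io

# A few rows copied from the CSV in this issue (decimal commas, ';' separator).
SAMPLE = """OACI;LAT;LONG
LFMH;45,53;4,30
LFLL;45,73;5,08
LFRB;48,45;-4,42
"""


def to_float(text: str) -> float:
    """Parse a number written with a decimal comma, e.g. '45,53'."""
    return float(text.replace(",", "."))


def to_wkt_point(row: dict) -> str:
    """Build 'POINT(<long> <lat>)'; WKT puts X (longitude) before Y (latitude)."""
    lon = to_float(row["LONG"])
    lat = to_float(row["LAT"])
    return f"POINT({lon} {lat})"


def wkt_points(text: str) -> dict:
    """Map each OACI code to its WKT point string."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return {row["OACI"]: to_wkt_point(row) for row in reader}


if __name__ == "__main__":
    for code, wkt in wkt_points(SAMPLE).items():
        print(code, wkt)
```

The longitude comes first because the expression in the config is `$input.LONG + ' ' + $input.LAT`.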

And I get this error:

2018-04-18 19:29:49:712 SEVER {db=SAMPLE} ETL process has problem: java.util.concurrent.ExecutionException: com.orientechnologies.orient.core.exception.OSchemaException: Document belongs to abstract class 'OPoint' and cannot be saved
        DB name="SAMPLE"Uncaught exception in thread 'pool-2-thread-1'
com.orientechnologies.orient.core.exception.OSchemaException: Document belongs to abstract class 'OPoint' and cannot be saved
        DB name="SAMPLE"
        at com.orientechnologies.orient.core.tx.OTransactionAbstract.getClusterName(OTransactionAbstract.java:241)
        at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveNew(OTransactionNoTx.java:245)
        at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.ja

And the only record inserted is:

{
    "result": [
        {
            "@type": "d",
            "@rid": "#26:0",
            "@version": 1,
            "@class": "Aeroport",
            "OACI": "LFMH",
            "LAT": 45.53,
            "LONG": 4.3,
            "@fieldTypes": "LAT=f,LONG=f"
        }
    ],
    "notification": "Query executed in 0.89 sec. Returned 1 record(s)"
}

I first tried to create the class with an embedded property in Studio (as in the Lucene part of the documentation):

CREATE class Aeroport
CREATE PROPERTY Aeroport.location EMBEDDED OPoint

But even after I log out of Studio, running the ETL fails with an error about concurrent access to the database (which might be the subject of another issue):

2018-04-18 20:11:26:785 INFO  OrientDB auto-config DISKCACHE=2,972MB (heap=455MB direct=524,288MB os=3,940MB)Exception `13B6AECC` in storage `plocal:/orientdb/databases/SAMPLE`: 2.2.33 (build
 77584cd6827f647cf4aa231cf27bd6f10bc04e2c, branch 2.2.x)
com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/orientdb/databases/SAMPLE' with mode=rw
        DB name="SAMPLE"
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:323)
        at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.open(ODatabaseDocumentTx.java:259)
        at com.orientechnologies.orient.core.db.OPartitionedDatabasePool$DatabaseDocumentTxPooled.internalOpen(OPartitionedDatabasePool.java:450)
        at com.orientechnologies.orient.core.db.OPartitionedDatabasePool.openDatabase(OPartitionedDatabasePool.java:311)
        at com.orientechnologies.orient.core.db.OPartitionedDatabasePool.acquire(OPartitionedDatabasePool.java:269)
        at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.<init>(OrientBaseGraph.java:145)
        at com.tinkerpop.blueprints.impls.orient.OrientGraphNoTx.<init>(OrientGraphNoTx.java:65)
        at com.tinkerpop.blueprints.impls.orient.OrientGraphFactory$2.getGraph(OrientGraphFactory.java:117)
        at com.tinkerpop.blueprints.impls.orient.OrientGraphFactory.getNoTx(OrientGraphFactory.java:241)
        at com.orientechnologies.orient.etl.loader.OOrientDBLoader.configureGraphDB(OOrientDBLoader.java:421)
        at com.orientechnologies.orient.etl.loader.OOrientDBLoader.configure(OOrientDBLoader.java:347)
        at com.orientechnologies.orient.etl.OETLProcessor.configureComponent(OETLProcessor.java:472)
        at com.orientechnologies.orient.etl.OETLProcessor.configureLoader(OETLProcessor.java:292)
        at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:222)
        at com.orientechnologies.orient.etl.OETLProcessor.parse(OETLProcessor.java:187)
        at com.orientechnologies.orient.etl.OETLProcessor.parseConfigAndParameters(OETLProcessor.java:155)
        at com.orientechnologies.orient.etl.OETLProcessor.main(OETLProcessor.java:119)
Caused by: com.orientechnologies.orient.core.exception.OStorageException: Cannot open storage it is acquired by other process

I then tried to use a console command block in the begin section:

  "begin": [
    {
      "console": {
        "commands": [
          "CONNECT plocal:/orientdb/databases/SAMPLE admin admin",
          "CREATE class Aeroport",
          "CREATE PROPERTY Aeroport.location EMBEDDED OPoint"
        ]
      }
    }
  ],

I got an error about a missing semicolon:

orientdb> CONNECT plocal:/orientdb/databases/SAMPLE admin admin CREATE class Aeroport CREATE PROPERTY Aeroport.location EMBEDDED OPoint

!Wrong syntax. If you're running in batch mode make sure all commands are delimited by semicolon (;) or a linefeed (\n). Expected:

CONNECT <url> <user> [<password>]

WHERE:

* url               The url of the remote server or the database to connect to in the format '<mode>:<path>'
* user              User name
* password          User password (optional)

!Unrecognized command: 'CONNECT plocal:/orientdb/databases/SAMPLE admin admin CREATE class Aeroport CREATE PROPERTY Aeroport.location EMBEDDED OPoint'

OK, but the example in the documentation has no semicolons! From the documentation:

{ 
   "console": {
      "commands": [
         "CONNECT plocal:/temp/db/mydb admin admin",
         "INSERT INTO Account set name = 'Luca'"
      ]
  }
}
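The line echoed at the `orientdb>` prompt in the error above looks as if the three commands reached the console as one string joined by spaces. The Python sketch below only illustrates that hypothesis: the space-joining is inferred from the log, not taken from the ETL source, and `join_with_spaces` and `split_batch` are hypothetical helpers. The splitting rule follows the console message (commands delimited by `;` or a linefeed).

```python
import re

# The commands exactly as written in my "begin" block.
COMMANDS = [
    "CONNECT plocal:/orientdb/databases/SAMPLE admin admin",
    "CREATE class Aeroport",
    "CREATE PROPERTY Aeroport.location EMBEDDED OPoint",
]


def join_with_spaces(commands):
    """Assumed behaviour: the array entries are concatenated with spaces."""
    return " ".join(commands)


def split_batch(script):
    """Split a batch on ';' or newline, as the console error message describes."""
    return [part.strip() for part in re.split(r"[;\n]", script) if part.strip()]


if __name__ == "__main__":
    joined = join_with_spaces(COMMANDS)
    print(joined)                                # matches the echoed prompt line
    print(len(split_batch(joined)))              # no delimiter: one unparseable command
    print(len(split_batch(";".join(COMMANDS))))  # with ';': three separate commands
```

If that reading is right, each entry of the `commands` array needs its own trailing semicolon, which the documentation example does not show.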

Even after adding semicolons, the ETL still fails:

orientdb {db=SAMPLE}> CREATE class Aeroport
Class created successfully. Total classes in database now: 20.
orientdb {db=SAMPLE}> CREATE PROPERTY Aeroport.location EMBEDDED OPoint
Property created successfully with id=1.
[file] INFO Load from file /tmp/data/sample.csv
2018-04-18 20:33:43:451 INFO  BEGIN ETL PROCESSOR
OrientDB console v.2.2.33 (build 77584cd6827f647cf4aa231cf27bd6f10bc04e2c, branch 2.2.x) https://www.orientdb.com
Type 'help' to display all the supported commands.
orientdb {db=SAMPLE}> CONNECT plocal:/orientdb/databases/SAMPLE admin admin
Disconnecting from the database [SAMPLE]...OK
Connecting to database [plocal:/orientdb/databases/SAMPLE] with user 'admin'...OK
orientdb {db=SAMPLE}> CREATE class Aeroport
Error: com.orientechnologies.orient.core.exception.OSchemaException: Class 'Aeroport' already exists in current database
        DB name="SAMPLE"
[file] INFO Reading from file /tmp/data/sample.csv with encoding UTF-8
2018-04-18 20:33:43:473 INFO  {db=SAMPLE} Started execution with 1 worker threads[orientdb] DEBUG orientdb: found 0 vertices in class 'null'
[orientdb] DEBUG orientdb: found metadata field 'null'
[orientdb] DEBUG - OrientDBLoader: created property 'Aeroport.OACI' of type: string
[orientdb] DEBUG - OrientDocumentLoader: created index 'Aeroport.OACI' type 'UNIQUE' against Class 'Aeroport', fields [OACI:string]
2018-04-18 20:33:43:700 SEVER {db=SAMPLE} ETL process has problem: java.lang.IllegalArgumentException: Type error. The class 'Aeroport' does not extend class 'V' and therefore cannot be considered a Vertex
2018-04-18 20:33:43:701 INFO  {db=SAMPLE} END ETL PROCESSOR
2018-04-18 20:33:43:702 INFO  {db=SAMPLE} + extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 250ms [0 warnings, 0 errors]
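The last error says that `Aeroport` does not extend `V`, which matches the `CREATE class Aeroport` command: it creates a plain class, while the vertex transformer needs a vertex class. One thing I could try is a begin block with semicolon-terminated commands and `EXTENDS V`. The Python sketch below just generates that JSON fragment; `begin_block` is a hypothetical helper, and I have not verified that this configuration gets past the `OPoint` abstract-class error on 2.2.33.

```python
import json


def begin_block(db_url, user, password, vertex_class):
    """Build an ETL "begin" section whose console commands each end with ';'."""
    commands = [
        f"CONNECT {db_url} {user} {password};",
        # EXTENDS V makes the class a vertex class, as the error message requires.
        f"CREATE CLASS {vertex_class} EXTENDS V;",
        f"CREATE PROPERTY {vertex_class}.location EMBEDDED OPoint;",
    ]
    return {"begin": [{"console": {"commands": commands}}]}


if __name__ == "__main__":
    block = begin_block(
        "plocal:/orientdb/databases/SAMPLE", "admin", "admin", "Aeroport"
    )
    print(json.dumps(block, indent=2))
```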

How should I proceed?

laa closed this as completed Aug 6, 2021