Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Caravel with Vertica #1193

Closed
drorata opened this issue Sep 26, 2016 · 15 comments · Fixed by #2581
Closed

Use Caravel with Vertica #1193

drorata opened this issue Sep 26, 2016 · 15 comments · Fixed by #2581

Comments

@drorata
Copy link

drorata commented Sep 26, 2016

I'm trying to connect Caravel to a Vertica cluster. The steps I took are:

  1. Install sqlalchemy-vertica-python and psycopg2.
  2. I configured a DB: vertica+vertica_python://dba:psss@host.name:5433/DB.schema. Testing the connection seems to be OK.
  3. Next, I try to create a table and this is where I fail. Can someone give me a hint what should be in each field? In particular Table Name (is it free text or refers to a table on Vertica?) and Schema? Almost all my attempts end with Table doesn't seem to exist in the specified database, couldn't fetch column information red message.

Thanks

@xrmx
Copy link
Contributor

xrmx commented Sep 26, 2016

Have you read the tutorial? http://airbnb.io/caravel/tutorial.html
The tables you add are the tables already present in the database you want to read from caravel

@xrmx xrmx added the question label Sep 26, 2016
@drorata
Copy link
Author

drorata commented Sep 26, 2016

Yes, I'm getting stuck at the definition of the table. BTW, I manage to connect to Vertica using SQLAlchemy from the console.

It seems like the table that Carvel identifies automatically have nothing to do with those that are actually available on Vertica

@xrmx
Copy link
Contributor

xrmx commented Sep 26, 2016

Possibly your tables are on a different schema?

@drorata
Copy link
Author

drorata commented Sep 26, 2016

I'm afraid I'm a little lost in naming conventions. In order to connect, I use something of this sort:

vertica+vertica_python://dba:psss@host.name:5433/DB.schema

Inside DB.schema I have a schema called db and inside it there are the tables I need. Does it make sense?

@LAlbertalli
Copy link
Contributor

@drorata
I'm a little bit lost on which is your problem. As a heads up, we are using Vertica in production with the OBDC driver, so our connection string looks like vertica+pyodbc://DB The Vertica SQLAlchemy driver for ODBC is no longer supported, you could see our patch here https://github.com/shopkick/vertica-sqlalchemy but we are removing Vertica so we are not supporting this patch.

For your specific problem, I will try to connect just to DB instead of DB.schema and I will specify the schema inside the Table configuration in Caravel. At least Vertica with the ODBC driver doesn't work well with schemas in the string. Let me know if removing the schema from the connection string and configuring it in the Tables help.

@drorata
Copy link
Author

drorata commented Sep 27, 2016

I'm also a bit confused :(

It seems like the connection to Vertica is OK (I'm still using the sqlalchemy-vertica-python since with your suggestion I failed to connet). After I connect, it suggest some 3 tables which I am not familiar with in the cluster.

When I try to set manually a table, I tried to fill the Table Name field in the form with the name of the table I need returned by SELECT table_name FROM v_catalog.all_tables WHERE schema_name='db' and table_name='my_table'. I filled the Schema field with the value returned by SELECT schema_name FROM v_catalog.schemata where schema_name='my_schema'.

At that point it returns the following error:

(vertica_python.errors.MissingSchema) Severity: ERROR, Message: Schema "pg_catalog" does not exist, Sqlstate: 3F000, Routine: RangeVarGetObjid, File: /scratch_a/release/svrtar28745/vbuild/vertica/Catalog/Namespace.cpp, Line: 288, SQL: u" SELECT c.oid FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE (n.nspname = 'db') AND c.relname = 'my_table' AND c.relkind in ('r', 'v', 'm', 'f') " [SQL: "\n SELECT c.oid\n FROM pg_catalog.pg_class c\n LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace\n WHERE (n.nspname = :schema)\n AND c.relname = :table_name AND c.relkind in ('r', 'v', 'm', 'f')\n "] [parameters: {'table_name': u'my_table', 'schema': 'my_schema'}]

@LAlbertalli
Copy link
Contributor

The error returned now make sense to me. The driver you are using is Buggy.

When you add a table, Caravel uses SQLAlchemy reflection capabilities to create a model of the Table you just added. Behind the scenes, SQLAlchemy uses the driver to get the table schema. As you could see from the error message it is trying to inspect the table schema using the table pg_catalog, that is Postgres, instead of v_catalog.

Vertica uses PostgreSQL syntax but changed the table name for the schema. The driver you are using still uses the Postgres function to perform the schema reflection and it fails. You probably need to patch the driver. In my case was just one function missing. Since the driver you are using comes from the same ancestor that I used, maybe my patch could help you. Try to look at the diff if it helps. jamescasbon/vertica-sqlalchemy@master...shopkick:master

@drorata
Copy link
Author

drorata commented Sep 28, 2016

This was helpful! Even though your patch didn't do the trick for me. Here's my version:

    @reflection.cache
    def get_unique_constraints(self, connection, table_name, schema=None, **kw):

        query = None
        if schema is not None:
            query = "select constraint_id, constraint_name, column_name from v_catalog.constraint_columns \n\
            WHERE table_name = '" + table_name + "' AND table_schema = '" + schema + "'"
        else:
            query = "select constraint_id, constraint_name, column_name from v_catalog.constraint_columns \n\
            WHERE table_name = '" + table_name + "'"

        rs = connection.execute(query)

        unique_names = {row[1] for row in rs}

        result_dict = {unique: [] for unique in unique_names}
        for row in rs:
            result_dict[row[1]].append(row[2])

        result = []
        for key in result_dict.keys():
            result.append(
                {"name": key,
                 "column_names": result_dict[key]}
            )

        return result

@drorata
Copy link
Author

drorata commented Sep 28, 2016

Here's my fork: drorata/sqlalchemy-vertica-python@40defc7

@dennisobrien
Copy link
Contributor

@drorata Thanks for the fix to sqlalchemy-vertica-python -- it fixed it for me. If you haven't already, it would be great if you could submit this fix as a PR to https://github.com/LocusEnergy/sqlalchemy-vertica-python

Thanks for your work!

@drorata
Copy link
Author

drorata commented Sep 29, 2016

Done LocusEnergy/sqlalchemy-vertica-python#1

@dany9988
Copy link

dany9988 commented Apr 3, 2017

@drorata Could you pls share steps to connect with Vertica from super-set? Am a newbie with superset and couldn't get enough material for this online.
Steps which i have done -

  1. Superset installation done.
  2. vertica_python,sqlalchemy_vertica_python & psycopg2 installed.

@drorata
Copy link
Author

drorata commented Apr 3, 2017

@dany9988 I'm sorry, I don't have Vertica at the moment, so I cannot reproduce the steps.

@dany9988
Copy link

dany9988 commented Apr 5, 2017

@drorata- Ok. Can you pls suggest whether you were able to add tables from vertica. Am stuck at that step. Getting Table [SchemaName.tablename] could not be found, please double check your database connection, schema, and table name.

@slash-cyberpunk
Copy link

slash-cyberpunk commented Apr 10, 2017

@dany9988 use SQLAlchemy URI: vertica+vertica_python://<user>:<password>@<host>:<port>/<db>
For help: http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html#database-urls

@dpgaspar dpgaspar mentioned this issue Dec 9, 2019
12 tasks
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this issue Nov 17, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this issue Nov 24, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this issue Nov 25, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this issue Nov 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants