Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Facing issue while using db #36

Closed
jayeshagwan1 opened this issue Dec 19, 2019 · 25 comments
Closed

Facing issue while using db #36

jayeshagwan1 opened this issue Dec 19, 2019 · 25 comments
Assignees

Comments

@jayeshagwan1
Copy link

I am using below command for sqlite db:

piicatcher db -s '/db/data' -t sqlite --scan-type shallow
where '/db/' is the path for data.sqlite file.

I am not sure what I am doing wrong. Getting below error.

image

If any detailed instructions are provided for db instances, it would help.

Thanks

@vrajat vrajat self-assigned this Dec 19, 2019
vrajat pushed a commit to vrajat/piicatcher that referenced this issue Dec 19, 2019
Database explorer dispatch was not working because dispatch was called
on the base class. To be able to cleanly use derived class dispatch,
Sqlite and other databases had to to be separated. So sqlite is a top
level option instead of being clubbed under 'db'

Sqlite tests were enabled and dispatch tests were fixed to reproduce the
bug.

Fix tokern#36
@vrajat vrajat closed this as completed in c802b11 Dec 19, 2019
@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

I've released a new version v0.6.2. One major change is that I had to split sqlite to keep the command line clean. Refer below. I'll work on revamping docs. Please follow #38

usage: piicatcher sqlite [-h] -s PATH [-f {ascii_table,json,db}]
                         [-c {deep,shallow}] [-o OUTPUT] [--list-all]

optional arguments:
  -h, --help            show this help message and exit
  -s PATH, --path PATH  File path to SQLite database
  -f {ascii_table,json,db}, --output-format {ascii_table,json,db}
                        Choose output format type
  -c {deep,shallow}, --scan-type {deep,shallow}
                        Choose deep(scan data) or shallow(scan column names
                        only)
  -o OUTPUT, --output OUTPUT
                        File path for report. If not specified, then report is
                        printed to sys.stdout
  --list-all            List all columns. By default only columns with PII
                        information is listed

@jayeshagwan1
Copy link
Author

jayeshagwan1 commented Dec 19, 2019

@vrajat Getting this error after v0.6.2. files also not working now

image

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

Can you check after running pip install pymssql ? The package is part of requirements and it should have been involved. I am not sure why it got missed. I'll see if I can reproduce it in the background.

@jayeshagwan1
Copy link
Author

Tried installing pymssql

image

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

More info: #29 (comment)
Moore's law. I'll release a new version with the right version.

@jayeshagwan1
Copy link
Author

For temporary fix I ran pip install "pymssql<3.0". Now Files are running fine. For db not able to get data:

image

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

OK. By default it only lists columns that has PII. You can list all columns using --list-all.

Shallow checks names of columns only. It looks for column names like name, email etc using regular expressions from https://github.com/madisonmay/CommonRegex. Do you have columns with these names ?

Another option is to also check data using -c deep

@jayeshagwan1
Copy link
Author

Yes I have such columns with names:

image

--list-all with -c deep

image

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

I recreated the table and it works for me:
Screenshot from 2019-12-19 15-45-10

Can you check if piicatcher can read all the tables in the sqlite database ? list all should not have returned an empty list. That points that it cannot find any tables and columns.

What are the data types of these columns ? piicatcher only checks text, varchar and char columns.

sqlite does not have user/roles/grant. So it cant be that.

Can you check if the following query returns any rows in sqlite3 ?

sqlite3 db.data.sqlite

sqlite>SELECT 
            "" as schema_name,
            m.name as table_name, 
            p.name as column_name,
            p.type as data_type
        FROM 
            sqlite_master AS m
        JOIN 
            pragma_table_info(m.name) AS p
        WHERE
            p.type like 'text' or p.type like 'varchar%' or p.type like 'char%'
        ORDER BY 
            m.name, 
            p.name

@jayeshagwan1
Copy link
Author

May be I am missing something
This is my sqlite

image

Anything you can see is not correct ?

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

Piicatcher should definitely detect these columns. I dont see anything wrong. Can you run this SQL query:

SELECT 
        "" as schema_name,
        m.name as table_name, 
        p.name as column_name,
        p.type as data_type
    FROM 
        sqlite_master AS m
    JOIN 
        pragma_table_info(m.name) AS p
    ORDER BY 
        m.name, 
        p.name

@jayeshagwan1
Copy link
Author

image

Any issues ?

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

Yes. data_type is null or empty. That is unexpected. I get

|test|address|varchar
|test|city|varchar
|test|company_name|varchar
|test|county|varchar
|test|email|varchar web
|test|first_name|varchar
|test|last_name|varchar
|test|phone1|varchar
|test|phone2|varchar
|test|state|varchar
|test|zip|varchar

What was your CREATE TABLE statement ? My guess is that you did not specify a type. I can reproduce with:

create table no_type(a,b);
# run select query
|no_type|a|
|no_type|b|

Can you recreate the table with data types ? For example:

create table test(first_name varchar, last_name varchar, company_name varchar, address varchar, city varchar, county varchar, state varchar, zip varchar, phone1 varchar, phone2 varchar, email varchar web);

@jayeshagwan1
Copy link
Author

I have script like this which has datatypes:

CREATE TABLE IF NOT EXISTS [userdata] (
[first_name] VARCHAR NULL,
[last_name] VARCHAR NULL,
[company_name] VARCHAR NULL,
[address] VARCHAR NULL,
[city] VARCHAR NULL,
[county] VARCHAR NULL,
[state] VARCHAR NULL,
[zip] INT NULL,
[phone1] VARCHAR NULL,
[phone2] VARCHAR NULL,
[email] VARCHAR NULL,
[web] VARCHAR NULL
);

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

I get the data type for that table.

|userdata|address|VARCHAR
|userdata|city|VARCHAR
|userdata|company_name|VARCHAR
|userdata|county|VARCHAR
|userdata|email|VARCHAR
|userdata|first_name|VARCHAR
|userdata|last_name|VARCHAR
|userdata|phone1|VARCHAR
|userdata|phone2|VARCHAR
|userdata|state|VARCHAR
|userdata|web|VARCHAR
|userdata|zip|INT

Which version of sqlite do you have ?

I have:

SQLite 3.29.0 2019-07-10 17:32:03 fc82b73eaac8b36950e527f12c4b5dc1e147e6f4ad2217ae43ad82882a88alt1
zlib version 1.2.11
gcc-9.2.1 20191008

@jayeshagwan1
Copy link
Author

V 3.30.1.0

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

OK. Then I am at a loss why data type is not returned for you.

@jayeshagwan1
Copy link
Author

I altered script and removed null and now I am getting the desired result:

image

But piicatcher sqlite still not returning data

image

@jayeshagwan1
Copy link
Author

My bad.... Now its working.. Thanks a lot...
image

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019 via email

@jayeshagwan1
Copy link
Author

I have tested piicatcher on different file formats like json, plaintext, xml, csv but does not work with xlsx, xlx

@vrajat
Copy link
Member

vrajat commented Dec 19, 2019

It does not support binary formats like xlsx, xls and PDF yet. Can you file a new feature request ?

@jayeshagwan1
Copy link
Author

Sure ...

@jayeshagwan1
Copy link
Author

Getting error while having deep scan mysql db

image

@vrajat
Copy link
Member

vrajat commented Dec 20, 2019

I've released a new version. Thanks for all the bug reports! Please open a new issue for future problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants