-
Notifications
You must be signed in to change notification settings - Fork 13
added MSSQL functionality and writing the resulting table to BigQuery in the example #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
had to add the BaseWrapper to the possible wrappers
… pictures and other types of objects to be imported
Thanks @jac2130 for contributing to this repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few comments
1. pip install git+git@github.com:MediaAgility/pysql-beam.git | ||
or | ||
2. pip installl pysql-beam | ||
pip install git+git@github.com:jac2130/pysql-beam.git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be git@github.com:yesdeepakverma/pysql-beam.git
""" | ||
import sys | ||
import json | ||
sys.path.insert(0, '/home/jupyter/shapiro-johannes/pysql-beam/pysql-beam') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need these 2 lines ?
# parser.add_value_provider_argument('--host', dest='host', required=False) | ||
# parser.add_value_provider_argument('--port', dest='port', required=False) | ||
# parser.add_value_provider_argument('--database', dest='database', required=False) | ||
# parser.add_value_provider_argument('--table', dest='table', required=False) | ||
# parser.add_value_provider_argument('--query', dest='query', required=False) | ||
# parser.add_value_provider_argument('--username', dest='username', required=False) | ||
# parser.add_value_provider_argument('--password', dest='password', required=False) | ||
#parser.add_value_provider_argument('--db_type', dest='db_type', default="mssql", required=False, help="the type of database; allowed are 'mssql', 'mysql' and 'postgres'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove these if not required ?
parser.add_value_provider_argument('--host', dest='host', default="localhost") | ||
parser.add_value_provider_argument('--port', dest='port', default="3306") | ||
parser.add_value_provider_argument('--database', dest='database', default="dverma") | ||
parser.add_value_provider_argument('--query', dest='query', default="SELECT * FROM dverma.userPointsLedger;") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be remove dverma database value from here and use something which is not user specific!
# 'postgres', options.password, | ||
# options.database, table=options.output_table, | ||
# wrapper=PostgresWrapper, autocommit=False, batch_size=500) | ||
#mysql_data | "Log records " >> beam.Map(log) | beam.io.WriteToText(options.output, num_shards=1, file_name_suffix=".json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove any unnecessary code
@@ -0,0 +1,68 @@ | |||
SELECT DATEFROMPARTS(YEAR([MRRECVH].RECV_DATE),MONTH([MRRECVH].RECV_DATE), DAY([MRRECVH].RECV_DATE)) as 'Date', [MRRECVH].RECV_DATE, [MRRECVD].CONTROL, SUBSTRING([MRRECVD].CONTROL, 9, 2) as 'ADJ', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this file ?
@@ -0,0 +1,3 @@ | |||
SELECT COMPANY, DIVISION, MARKET, COMMODITY, EFFECTIVE_DATE, ISSUE_DATE, DESCRIPTION, LOW_PRICE, HIGH_PRICE, PRICE_UM, CREATEDATE, CREATETIME, CREATEUSER, CREATESTATION, LASTDATE, LASTTIME, LASTUSER, LASTSTATION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this file? This might be something specific to your personal or professional work.
sys.path.insert(0, '/home/jupyter/shapiro-johannes/pysql-beam/pysql-beam') | ||
sys.path.insert(0, '/home/jupyter/shapiro-johannes/pysql-beam') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might not need these 2 lines
print('port', self.port) | ||
print('database', self.database) | ||
print('username', self.username) | ||
print('password', self.password) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not log user name and password, sometimes these being end up logged in log files
@jac2130 , Can you resolve the README conflict. |
I want to thank you for building such a cool library and I find that it is easily extensible; I'd love to clean some things up as soon as I have some time, in particular I found a few things that I want to improve. For example in
wrapper.py
on line 501 I find that at some point sometimes at this place in the code the BaseWrapper.paginated_query has already been called and thus there are "LIMIT" and "OFFSET" sub-strings already present ...I don't know why? Aside from that, much of the same code is repeated in various places which makes it confusing to read and also one has to fix the same thing in many places if there needs to be a fix or addition. Aside from this, this is some great code and it helps me out of a bind; thank you!