Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest/mssql): load jobs and stored procedures #5363

Merged

Conversation

RChygir
Copy link
Contributor

@RChygir RChygir commented Jul 8, 2022

  • mssql jobs -> datahub dataFlow
  • mssql job step -> datahub dataJob
  • stored procedures -> datahub dataJobs

Co-Authored-By: Oleksandr aleksandr.lytvynov@gmail.com

Summary

We add loading dataFlow and dataJob entities. We add configuration flags for separated work with entities.

Changes

  • Add MSSQL Job as dataFlow;
  • Add MSSQL Job Step as dataJob with properties:
Property Example
command EXEC [DataBaseName].[SchemaName].[ProcedureName]
date_created 1990-01-01 00:00:00.000000
date_modified 1990-01-01 00:00:00.000000
description Full description
job_id 98765432-1987-6543-2198-7654321987654
job_name JobName
step_id 1
step_name StepName
subsystem TSQL
  • Add Stored Procedure as dataJob with properties;
Property Example
code CREATE PROCEDURE [SchemaName].[ProcedureName] AS BEGIN SET NOCOUNT ON; END
create_date 1990-01-01 00:00:00.000000
definition Full description
depending_on_procedure {}
input parameters []
modify_date 1990-01-01 00:00:00.000000
procedure_depends_on {'SchemaName.rc.TableName': 'USER_TABLE'}
  • Stored procedures pack into dataFlow for StoredProcedures;
  • Add configuration flags:
    • include_stored_procedures - for load stored procedures as dataJob;
    • include_code - for load stored procedure code as property;
    • include_jobs - for load MSSQL Jobs as dataFlow and Steps as dataJob;
    • include_descriptions - for load tables and table columns descriptions.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@maggiehays maggiehays added community-contribution PR or Issue raised by member(s) of DataHub Community ingestion PR or Issue related to the ingestion of metadata labels Jul 8, 2022
@github-actions
Copy link

github-actions bot commented Jul 8, 2022

Unit Test Results (build & test)

621 tests  ±0   617 ✔️ ±0   15m 46s ⏱️ +5s
157 suites ±0       4 💤 ±0 
157 files   ±0       0 ±0 

Results for commit 8f11328. ± Comparison against base commit 817406e.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Jul 8, 2022

Unit Test Results (metadata ingestion)

       8 files  ±0         8 suites  ±0   58m 51s ⏱️ + 1m 26s
   759 tests ±0     753 ✔️  - 3  3 💤 ±0  3 +3 
1 520 runs  ±0  1 507 ✔️  - 6  7 💤 ±0  6 +6 

For more details on these failures, see this check.

Results for commit 8f11328. ± Comparison against base commit 817406e.

♻️ This comment has been updated with latest results.

@RChygir RChygir requested a review from matheuzzy August 3, 2022 07:25
@jjoyce0510 jjoyce0510 requested review from mayurinehate and removed request for matheuzzy November 16, 2022 21:42
@jjoyce0510
Copy link
Collaborator

@mayurinehate Please have a look. Let's aim to get this one in!

@RChygir
Copy link
Contributor Author

RChygir commented Nov 17, 2022

Updated source.

@RChygir RChygir requested review from matheuzzy and removed request for mayurinehate November 18, 2022 07:47
@mayurinehate
Copy link
Collaborator

mayurinehate commented Nov 18, 2022

Hey @RChygir thank you for the PR.

Mssql tests are failing in CI. Can you please fix it ?

Also it would be great if you can add more tests for MSSQL Jobs as Dataflow and MSSQL Job steps as DataJob.

@hsheth2 hsheth2 assigned mayurinehate and unassigned rslanka Dec 2, 2022
@mayurinehate
Copy link
Collaborator

Hi again @RChygir !! did you chance to see my comment above ?

Let me know if you need any help or have any doubts.

@jjoyce0510
Copy link
Collaborator

Hi folks! If we do not hear back on this PR soon, we will move to close due to inactivity.

Cheers
John

@jjoyce0510 jjoyce0510 added the abandoned PRs that have been closed due to being abandoned label Dec 22, 2022
@RChygir
Copy link
Contributor Author

RChygir commented Dec 23, 2022

I apologize for the delay in responding to your comments. I will try to correct my mistakes as soon as possible. @mayurinehate Thanks for your help

@RChygir RChygir force-pushed the feature/chyhir_lytvynov_mssql_extend branch 2 times, most recently from dc8791b to 3c9c9fc Compare January 10, 2023 10:07
@RChygir RChygir force-pushed the feature/chyhir_lytvynov_mssql_extend branch from cff30dd to 848b3a9 Compare January 16, 2023 08:23
@mayurinehate
Copy link
Collaborator

@RChygir Is it possible to add integration tests related to jobs and procedures integration ? I plan to dwelve into MSSQL side of things next, Tests would greatly help. Existing integration tests are here - https://github.com/datahub-project/datahub/tree/master/metadata-ingestion/tests/integration/sql_server

@RChygir
Copy link
Contributor Author

RChygir commented Jan 18, 2023

Tests updated.

@RChygir RChygir force-pushed the feature/chyhir_lytvynov_mssql_extend branch from 16eaf8e to de94ccf Compare January 18, 2023 10:32
Copy link
Collaborator

@mayurinehate mayurinehate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine overall. Left some comments about URN construction and unused code, which need to be addressed.

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

```
USE MSDB
GRANT SELECT ON OBJECT::msdb.dbo.sysjobsteps TO 'USERNAME'
GRANT SELECT ON OBJECT::msdb.dbo.sysjobs TO 'USERNAME'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@hsheth2 hsheth2 added the merge-pending-ci A PR that has passed review and should be merged once CI is green. label Aug 15, 2023
@DmytroYurchuk
Copy link

Can somebody help with these failed tests?

@anshbansal anshbansal merged commit 43d48dd into datahub-project:master Aug 24, 2023
24 of 25 checks passed
@anshbansal
Copy link
Collaborator

cypress test failures look unrelated so merging this in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-contribution PR or Issue raised by member(s) of DataHub Community ingestion PR or Issue related to the ingestion of metadata merge-pending-ci A PR that has passed review and should be merged once CI is green.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants