Add ability to join cursors by specific columns(PK/UK) #453
Currently, cursor comparison is done as a comparison of list data (the order of rows matters). Implement new methods:
The main difference between the two is that when using
When joining with
Snippets for PoC

Existing solution - joining by row_no and comparing whole row - simplified example (without dbms_xml):
with expected_cursor as (select * from user_objects order by 1 desc),
actual_cursor as (select * from user_objects union all select * from user_objects where rownum = 1 order by 1 asc)
select nvl(exp.row_no, act.row_no) diff_rows
from (select column_value row_data, rownum as row_no
from table( xmlsequence( extract( XMLTYPE(cursor( select * from expected_cursor) ),'ROWSET/*') ) ) ucd
) exp
full outer join
(select column_value row_data, rownum as row_no
from table( xmlsequence( extract( XMLTYPE(cursor( select * from actual_cursor) ),'ROWSET/*') ) ) ucd
) act
on (exp.row_no = act.row_no)
where nvl(dbms_lob.compare(xmlserialize( content exp.row_data no indent), xmlserialize( content act.row_data no indent)),1) != 0;

New style - joining by whole row data - simplified example (without dbms_xml):
with expected_cursor as (select * from user_objects order by 1 desc),
actual_cursor as (select * from user_objects union all select * from user_objects where rownum = 1 order by 1 asc)
select coalesce(exp.row_hash,act.row_hash) row_hash,
coalesce(exp.duplicate_no, act.duplicate_no) duplicate_no,
case when exp.row_hash is null then 'extra row' else 'missing row' end diff_type
from (select ucd.*, row_number() over(partition by row_hash order by row_hash) duplicate_no
from (select ucd.column_value row_data,
dbms_crypto.hash( value(ucd).getclobval(),3/*HASH_SH1*/) row_hash
from table( xmlsequence( extract( XMLTYPE(cursor( select * from expected_cursor) ),'ROWSET/*') ) ) ucd
) ucd
) exp
full outer join
(select ucd.*, row_number() over(partition by row_hash order by row_hash) duplicate_no
from (select ucd.column_value row_data,
dbms_crypto.hash( value(ucd).getclobval(),3/*HASH_SH1*/) row_hash
from table( xmlsequence( extract( XMLTYPE(cursor( select * from actual_cursor) ),'ROWSET/*') ) ) ucd
) ucd
) act
on exp.row_hash = act.row_hash
and exp.duplicate_no = act.duplicate_no
where exp.row_hash is null or act.row_hash is null;

New style query comparing with PK - simplified example (without dbms_xml):
with expected_cursor as
(select 4 id, 'd' value from dual union all
select 2 id, 'b' value from dual union all
select 3 id, 'b' value from dual union all
select 3 id, 'c' value from dual union all
select 1 id, 'a' value from dual
),
actual_cursor as
(select 1 id, 'a' value from dual union all
select 2 id, 'b' value from dual union all
select 3 id, 'c' value from dual union all
select 5 id, 'x' value from dual union all
select 4 id, 'w' value from dual
)
select coalesce(exp.row_hash,act.row_hash) row_hash,
coalesce(exp.pk_duplicate_no, act.pk_duplicate_no) pk_duplicate_no,
case
when act.row_hash is null then 'missing row'
when exp.row_hash is null then 'extra row'
else 'diff row'
end diff_type,
exp.row_data.getClobVal() expected_data,
act.row_data.getClobVal() actual_data
from (select ucd.*, row_number() over(partition by pk_hash order by row_hash) pk_duplicate_no
from (select ucd.column_value row_data,
dbms_crypto.hash( value(ucd).getclobval(),3/*HASH_SH1*/) row_hash,
dbms_crypto.hash( extract(value(ucd),'ROW/ID').getClobVal(),3/*HASH_SH1*/) pk_hash
from table( xmlsequence( extract( XMLTYPE(cursor( select * from expected_cursor) ),'ROWSET/*') ) ) ucd
) ucd
) exp
full outer join
(select ucd.*, row_number() over(partition by pk_hash order by row_hash) pk_duplicate_no
from (select ucd.column_value row_data,
dbms_crypto.hash( value(ucd).getclobval(),3/*HASH_SH1*/) row_hash,
dbms_crypto.hash( extract(value(ucd),'ROW/ID').getClobVal(),3/*HASH_SH1*/) pk_hash
from table( xmlsequence( extract( XMLTYPE(cursor( select * from actual_cursor) ),'ROWSET/*') ) ) ucd
) ucd
) act
on exp.pk_hash = act.pk_hash
and exp.pk_duplicate_no = act.pk_duplicate_no
where exp.row_hash is null or act.row_hash is null or exp.row_hash != act.row_hash;
Work in Progress
Ability to join cursors by specific columns(PK/UK)
This feature could/should be implemented in a similar fashion to excluding columns with the a_exclude parameter. We would have an additional parameter for the ut_equal matcher when the matcher is used with a refcursor. If the parameter is present, we can transform it into XPath to extract the PK/UK/join column values from the data and store them in a separate VARCHAR2 column of the temp table for cursor data (with a limit of 4000 bytes).
There are a few ways of extracting column data from XML: XSL (https://stackoverflow.com/questions/17314062/how-to-convert-xml-to-csv-using-xsl) or extraction with XPath.
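A minimal sketch of the XPath route, reusing the XML conversion from the PoC queries above. It assumes the join columns are ID and VALUE (a made-up composite key) and simply concatenates the extracted elements into one VARCHAR2 key; the sample data, the NOTE column and the key_value alias are illustrative only.

```sql
-- Sketch only: ID and VALUE stand in for user-supplied join columns.
with actual_cursor as
  (select 1 id, 'a' value, 'x' note from dual union all
   select 2 id, 'b' value, 'y' note from dual
  )
select -- each extract() returns the matching element, e.g. <ID>1</ID>;
       -- getStringVal() converts it to VARCHAR2 so it fits a key column
       extract(value(ucd), 'ROW/ID').getStringVal() ||
       extract(value(ucd), 'ROW/VALUE').getStringVal() as key_value,
       ucd.column_value as row_data
  from table( xmlsequence( extract( xmltype(cursor(select * from actual_cursor)), 'ROWSET/*') ) ) ucd;
```

Keeping the element tags in the key avoids ambiguity between value pairs such as ('1','23') and ('12','3'), at the cost of a longer key within the 4000-byte limit.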
The temp table keys column could be indexed with a unique index, so that we can actually validate that the join columns are unique.
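A sketch of what such an indexed key column might look like. The table and column names below are hypothetical and are not the framework's actual temp table; data_id is assumed to identify one cursor's data set, so that uniqueness is checked per cursor rather than across both.

```sql
-- Hypothetical table, for illustration only
create global temporary table cursor_data_sketch (
  data_id   raw(32)        not null, -- assumed identifier of one cursor's data set
  item_no   integer        not null, -- position of the row in the cursor
  item_data xmltype,                 -- the row converted to XML
  item_key  varchar2(4000)           -- extracted join-column values
) on commit preserve rows;

-- Duplicate keys within one data set now raise ORA-00001 on insert
create unique index cursor_data_sketch_uk on cursor_data_sketch (data_id, item_key);
```

On databases with a small block size, a 4000-byte key may exceed the maximum index key length (ORA-01450), so the key limit might need to be lower in practice.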
If the columns are not unique, the expectation should fail with a meaningful message informing the user that the cursor data contains duplicates.
We need to be able to tell the user which of the compared cursors lacks uniqueness, which is again tricky because the data_value objects are not aware of where they were created from (actual or expected).
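One possible way around this, sketched on a cut-down version of the PoC data with ID as the join column: tag each row with its origin before checking for duplicates, so the failure message can name the offending cursor. The src literal is an assumption of this sketch; data_value itself carries no such information.

```sql
-- Sketch only: id 3 is duplicated in expected_cursor, actual_cursor is unique
with expected_cursor as
  (select 2 id, 'b' value from dual union all
   select 3 id, 'b' value from dual union all
   select 3 id, 'c' value from dual
  ),
actual_cursor as
  (select 2 id, 'b' value from dual union all
   select 3 id, 'c' value from dual
  )
select src, pk_value, count(*) as occurrences
  from (select 'expected' as src, -- origin tag added by this sketch
               extract(value(ucd), 'ROW/ID').getStringVal() as pk_value
          from table( xmlsequence( extract( xmltype(cursor(select * from expected_cursor)), 'ROWSET/*') ) ) ucd
         union all
        select 'actual' as src,
               extract(value(ucd), 'ROW/ID').getStringVal() as pk_value
          from table( xmlsequence( extract( xmltype(cursor(select * from actual_cursor)), 'ROWSET/*') ) ) ucd
       )
 group by src, pk_value
having count(*) > 1; -- keeps only keys that occur more than once on one side
```

Any row returned identifies both the duplicated key and the side it came from, which is the information the failure message would need.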