In [2]:
import calitp.magics

The Littlepay docs say that the `adjustment_id` field is a unique identifier. Let's verify that:

In [3]:
%%sql

select adjustment_id, count(*)
from payments.stg_enriched_micropayment_adjustments
group by 1
having count(*) > 1
order by 2 desc

Unnamed: 0,adjustment_id,f0_
0,58cd50ba-55d2-4a76-a5ad-ca981f81aa1f,87
1,0c8b761c-2570-4fe7-a6d4-5c97fabf09d0,67
2,1713d5bb-61b9-46e9-9ace-edb74775cc78,66
3,e05855c1-a820-4d6b-beef-ad325f82f6ea,66
4,47e3e78c-cbba-44ba-a480-a6ff38b66a5f,62
...,...,...
11655,f2079555-f1a5-418a-af5c-f3fda60a746e,2
11656,f4b563e9-956d-4f48-98ab-0621f009693b,2
11657,f69231bb-79e5-42e4-8f49-8bc4db214e1d,2
11658,fb187d3a-f9cc-43a8-9952-0d39946b6883,2


Welp, that doesn't seem true: 11,659 `adjustment_id` values appear more than once, and the most frequent one appears 87 times. Let's take that one and see which micropayments are associated with it:

In [13]:
%%sql

select a.type, m.*
from payments.stg_enriched_micropayment_adjustments a
join payments.stg_cleaned_micropayments m using (micropayment_id)
where adjustment_id = '58cd50ba-55d2-4a76-a5ad-ca981f81aa1f'
and applied is true
order by transaction_time

Unnamed: 0,type,micropayment_id,aggregation_id,participant_id,customer_id,funding_source_vault_id,transaction_time,payment_liability,charge_amount,nominal_amount,currency_code,type_1,charge_type,calitp_extracted_at,calitp_hash,calitp_export_account,calitp_export_datetime
0,MULTI_DAY_CAP,f4a13065-56bb-433b-a61d-0e6f79380d1a,f8721b72-40e2-4904-ac68-b2fc4d7a564b,mst,3f9e3e46-b33f-4962-aaad-21fab46e2769,bdb897cc-1204-4d19-a6db-3f15e5497b6d,2021-07-27 00:47:13+00:00,OPERATOR,0.5,2.5,840,DEBIT,complete_variable_fare,2021-11-06,kT786JTbBSnseRnCiijOYw==,,
1,MULTI_DAY_CAP,deddaa44-6421-469b-bbaa-5194dd574c57,b2677f0f-242c-49c7-8bef-e52f283db102,mst,3f9e3e46-b33f-4962-aaad-21fab46e2769,bdb897cc-1204-4d19-a6db-3f15e5497b6d,2021-07-27 14:05:19+00:00,OPERATOR,0.0,2.5,840,DEBIT,complete_variable_fare,2021-11-06,Px8khghJQkwKCQxqky0Qhg==,,
2,MULTI_DAY_CAP,215de51a-2e0b-4b3f-ac9f-a02357242b1c,3b2fc351-9080-4418-ae3b-35cb37963360,mst,3f9e3e46-b33f-4962-aaad-21fab46e2769,bdb897cc-1204-4d19-a6db-3f15e5497b6d,2021-07-28 14:03:39+00:00,OPERATOR,0.0,2.5,840,DEBIT,complete_variable_fare,2021-11-06,MDjtS5affHx7Zk0JvdlLtA==,,
3,MULTI_DAY_CAP,3f0cf3e5-f370-41cc-b0d8-888e079e8823,3b2fc351-9080-4418-ae3b-35cb37963360,mst,3f9e3e46-b33f-4962-aaad-21fab46e2769,bdb897cc-1204-4d19-a6db-3f15e5497b6d,2021-07-28 14:19:04+00:00,OPERATOR,0.0,1.5,840,DEBIT,complete_variable_fare,2021-11-06,vUp2+/3lStUBg9EOQUNb8A==,,
4,MULTI_DAY_CAP,27fde697-0da7-4d6b-8b81-fbb1638f6d31,3b2fc351-9080-4418-ae3b-35cb37963360,mst,3f9e3e46-b33f-4962-aaad-21fab46e2769,bdb897cc-1204-4d19-a6db-3f15e5497b6d,2021-07-29 00:48:20+00:00,OPERATOR,0.0,2.5,840,DEBIT,complete_variable_fare,2021-11-06,02s1e2BTgqSH3DABYGWoFA==,,


So it looks like a rider hit a multi-day cap (the adjustment `type` is `MULTI_DAY_CAP`), and then every ride (micropayment) after that had the same adjustment applied to it. So `adjustment_id` isn't unique _at all_, but at least that makes sense. Let's see how many adjustments correspond to each micropayment:

In [20]:
%%sql

select micropayment_id, count(distinct a.calitp_hash), count(*)
from payments.stg_enriched_micropayment_adjustments a
join payments.stg_cleaned_micropayments m using (micropayment_id)
where applied is true
group by 1
having count(*) > 1
order by 2 desc, 3 desc

Unnamed: 0,micropayment_id,f0_,f1_
0,dd601a49-f2dc-4f7b-816a-4789d25e7a71,1,2
1,1d4fbf0a-eaad-48d6-918c-18b4b64e0d69,1,2
2,2866eaba-fb42-47bb-80a7-614bb50d3805,1,2
3,9e59ca8d-1674-4b64-8b3c-05046da245d2,1,2
4,c1188666-53fe-41fa-b7e9-37b3e7b7ebca,1,2
...,...,...,...
73,db369532-c237-4be6-815f-51604314162d,1,2
74,e0ccff8f-4f71-40fa-9785-cbd559c2e863,1,2
75,7bc7906d-83d4-4084-bf95-4f28a6ea8105,1,2
76,b01e8fa8-30c1-4e15-930e-ed85a4c2b140,1,2


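Every micropayment with more than one applied adjustment row has exactly two rows but only one distinct adjustment `calitp_hash`, which suggests the extra rows are duplicates of the same record rather than distinct adjustments. So, we should be able to deduplicate on `calitp_hash`, and then validate that the (`micropayment_id`, `adjustment_id`) pair is unique.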
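
As a rough sketch of that plan, here is the deduplicate-then-validate logic run against a handful of made-up rows. It assumes rows sharing a `calitp_hash` are identical, uses an in-memory SQLite database as a stand-in for the warehouse, and the table names `adjustments` and `deduped` plus all IDs and hashes are hypothetical. The real tables would need the same two queries adapted to their full column list.

```python
import sqlite3

# Hypothetical toy rows: (micropayment_id, adjustment_id, calitp_hash).
rows = [
    ("mp-1", "adj-A", "hash-1"),
    ("mp-1", "adj-A", "hash-1"),  # exact duplicate of the row above
    ("mp-2", "adj-A", "hash-2"),  # same adjustment applied to a later ride
    ("mp-3", "adj-B", "hash-3"),
]

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
conn.execute(
    "create table adjustments "
    "(micropayment_id text, adjustment_id text, calitp_hash text)"
)
conn.executemany("insert into adjustments values (?, ?, ?)", rows)

# Step 1: keep one row per calitp_hash. min() is safe here only under the
# assumption that rows sharing a hash are identical.
conn.execute(
    """
    create table deduped as
    select
        calitp_hash,
        min(micropayment_id) as micropayment_id,
        min(adjustment_id) as adjustment_id
    from adjustments
    group by calitp_hash
    """
)
deduped_count = conn.execute("select count(*) from deduped").fetchone()[0]

# Step 2: any (micropayment_id, adjustment_id) pair returned here still
# appears more than once after deduplication, i.e. violates uniqueness.
violations = conn.execute(
    """
    select micropayment_id, adjustment_id, count(*) as n
    from deduped
    group by micropayment_id, adjustment_id
    having count(*) > 1
    """
).fetchall()

print(deduped_count, violations)
```

An empty `violations` list means the pair is unique once duplicates are removed. Any rows that come back would be pairs that differ in some other column and deserve a closer look.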