tango-python SDK / Tango API — follow-up feedback from a different use case
Companion to the prior issue.
This round is from integrating Tango alongside USASpending and SAM.gov
as enrichment sources in a continuously-deployed pipeline. None of the
items below block the work — we have workarounds for everything — but
flagging them in case any are useful for documentation or backlog.
What's working great in this use case
- rate_limit_info on the client is accurate and accessible. We rely
on it for pacing decisions across multiple workflows that share a
medium-tier 7,500/day quota.
- Schema-stable dataclass responses make response handling
pleasant compared to the loose-shape USASpending API.
- Deterministic key values let us write resumable fetchers cleanly.
Each workflow's output CSV carries a lookup_status column; PIIDs
already in found / not_found state are skipped on re-run, and error
rows are retried. The Tango key pattern is what makes this safe.
- list_vehicles(search="…") plus list_vehicle_awardees(uuid=…)
surfaces program-level vehicle families (CIO-SP3, OASIS+, Alliant 2,
EIS, SEWP, …) that USASpending's public API doesn't name at all.
This is the headline reason Tango is in our pipeline.
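
For readers who want to reproduce the resumable pattern described in the list above, here is a minimal sketch of the skip/retry decision. The column names piid and lookup_status, the helper name piids_to_fetch, and the sample PIIDs are illustrative assumptions, not our production code or anything provided by the SDK.

```python
import csv
import io

# Statuses treated as final. Rows in any other state (for example "error"),
# and PIIDs absent from the previous output, are fetched again.
DONE_STATUSES = {"found", "not_found"}


def piids_to_fetch(all_piids, existing_csv):
    """Return the PIIDs that still need a lookup, preserving input order.

    `existing_csv` is a text file object holding a previous run's output.
    The `piid` and `lookup_status` column names are hypothetical.
    """
    done = {
        row["piid"]
        for row in csv.DictReader(existing_csv)
        if row["lookup_status"] in DONE_STATUSES
    }
    return [piid for piid in all_piids if piid not in done]


# Sample data standing in for a previous run's CSV.
previous_run = io.StringIO(
    "piid,lookup_status\n"
    "PIID-A,found\n"
    "PIID-B,not_found\n"
    "PIID-C,error\n"
)
todo = piids_to_fetch(["PIID-A", "PIID-B", "PIID-C", "PIID-D"], previous_run)
print(todo)  # ['PIID-C', 'PIID-D']
```

PIID-C is retried because its last status was error, and PIID-D is fetched because it has never been looked up.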
Friction we hit (ranked by impact on this pipeline)
1. Vehicle ↔ IDV linkage requires two calls and a client-side exact-match check
Use case: for every parent IDV PIID in our dataset (~7,800), look up
the program-level Vehicle it belongs to.
The shortest path we found:

    # Call 1: resolve PIID → solicitation_identifier
    idv = client.list_idvs(piid=parent_piid, limit=1,
                           shape="piid,competition(*)")
    sol = idv.results[0]["competition"]["solicitation_identifier"]

    # Call 2: search Vehicles by that solicitation_identifier
    veh = client.list_vehicles(search=sol, limit=5,
                               shape="uuid,program_acronym,vehicle_type,"
                                     "solicitation_identifier,idv_count,"
                                     "awardee_count,order_count,total_obligated")

    # Client-side filter: `search` is keyword-style, not exact
    match = next((v for v in veh.results
                  if v["solicitation_identifier"] == sol), None)

That's 2 API calls per PIID plus a client-side exact-match check,
because (a) list_idvs doesn't have a vehicle(…) shape selector and
(b) list_vehicles accepts no exact solicitation_identifier or
piid filter. At our parent-PIID count this roughly doubles our
backfill cost (~15,500 calls instead of ~7,800).
Possible fix: either expose
list_vehicles(solicitation_identifier=…, exact=True) /
list_vehicles(piid=…) as a server-side filter, or let list_idvs
accept a vehicle(uuid,program_acronym,…) shape that joins the
Vehicle in-server. Either would collapse this to 1 call/PIID and
eliminate the client-side exact-match step.
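
Until a server-side filter exists, the two-call lookup above can be wrapped in a helper that guards against empty results and memoizes the second call. This is only a sketch under stated assumptions: the list_idvs and list_vehicles arguments are copied from the snippet above, resolve_vehicle and StubClient are hypothetical names, and the cache only saves calls if several parent IDVs share a solicitation_identifier, which we have not measured.

```python
from types import SimpleNamespace

VEHICLE_SHAPE = (
    "uuid,program_acronym,vehicle_type,solicitation_identifier,"
    "idv_count,awardee_count,order_count,total_obligated"
)


def resolve_vehicle(client, parent_piid, cache):
    """Return the Vehicle dict for a parent IDV PIID, or None if unresolved.

    `cache` maps solicitation_identifier -> Vehicle (or None), so call 2 is
    skipped for any solicitation that has already been looked up.
    """
    idv = client.list_idvs(piid=parent_piid, limit=1,
                           shape="piid,competition(*)")
    if not idv.results:
        return None
    competition = idv.results[0].get("competition") or {}
    sol = competition.get("solicitation_identifier")
    if not sol:
        return None
    if sol not in cache:
        veh = client.list_vehicles(search=sol, limit=5, shape=VEHICLE_SHAPE)
        # `search` is keyword-style, so keep only the exact match.
        cache[sol] = next(
            (v for v in veh.results if v["solicitation_identifier"] == sol),
            None,
        )
    return cache[sol]


class StubClient:
    """Offline stand-in for the real client (hypothetical data)."""

    def __init__(self):
        self.calls = 0

    def list_idvs(self, piid, limit, shape):
        self.calls += 1
        sol = {"PIID-1": "SOL-X", "PIID-2": "SOL-X"}.get(piid)
        results = ([{"piid": piid,
                     "competition": {"solicitation_identifier": sol}}]
                   if sol else [])
        return SimpleNamespace(results=results)

    def list_vehicles(self, search, limit, shape):
        self.calls += 1
        return SimpleNamespace(results=[
            {"uuid": "u-2", "solicitation_identifier": search + "-AMEND"},
            {"uuid": "u-1", "solicitation_identifier": search},
        ])


client = StubClient()
cache = {}
first = resolve_vehicle(client, "PIID-1", cache)    # 2 calls
second = resolve_vehicle(client, "PIID-2", cache)   # 1 call (cache hit)
missing = resolve_vehicle(client, "PIID-9", cache)  # 1 call, no IDV found
```

With the stub, the three lookups cost four calls instead of six. How much the cache helps on real data depends entirely on how often parent IDVs share a solicitation.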
2. Entity.immediate_owner and highest_owner lack uei
Use case: we maintain a manually-curated corporate-family override
file (141 rows / 37 canonical-parent groups — for example Lockheed
Martin has 15 distinct UEIs in USASpending bulk data that need to
roll up into one canonical parent_uei). We evaluated using Tango's
owner fields to auto-derive these from SAM.
Repro: shape any entity with star-expansion on the owner fields,
including a known subsidiary like General Dynamics Land Systems
(HAWKSQF848W7):

    client.list_entities(uei="HAWKSQF848W7", limit=1,
                         shape="legal_business_name,"
                               "immediate_owner(*),highest_owner(*)")

Owner responses come back as:

    {"cage_code": "8JFT1",
     "legal_business_name": "GENERAL DYNAMICS LAND SYSTEMS, GLOBAL LLC"}
    {"cage_code": "95403",
     "legal_business_name": "GENERAL DYNAMICS CORP"}

The owner is identified by NAME and CAGE — uei is not in the
returned payload. Across the 5 known subsidiaries we probed (all
expected to roll up to General Dynamics), 4 came back with owner
records carrying only cage_code + legal_business_name; the 5th
had no owner record at all. Without a UEI on the owner side we can't
join subsidiary_uei → parent_uei.
Probable root cause: SAM's underlying registration data populates
parent registrant fields with name + CAGE but not the parent's own
UEI, even when that UEI is known. So this may need an upstream SAM API
enhancement rather than a Tango fix. But if Tango maintains its own
Entity → Entity index and could backfill the UEI when the owner's
legal_business_name resolves to a known Entity in the registry, that'd
be the unlock.
Possible fix: when the owner's legal_business_name + cage_code
matches a registered Entity, surface that Entity's uei on the owner
record.
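
To make the proposed matching rule concrete, here is a rough sketch of the join we have in mind, written as client-side code over plain dicts. It is not a claim about how Tango stores entities. The helper names, the name normalization, and the UEI value EXAMPLE-PARENT-UEI are all hypothetical; only the two owner records come from the responses shown above.

```python
def normalize(name):
    """Uppercase, drop commas, and collapse whitespace for comparison."""
    return " ".join(name.upper().replace(",", " ").split())


def build_owner_index(entities):
    """Map (cage_code, normalized name) -> uei for known entities."""
    return {
        (e["cage_code"], normalize(e["legal_business_name"])): e["uei"]
        for e in entities
    }


def backfill_owner_uei(owner, index):
    """Return a copy of `owner` with a `uei` key (None when unresolved)."""
    if not owner:
        return None
    key = (owner.get("cage_code"),
           normalize(owner.get("legal_business_name", "")))
    return {**owner, "uei": index.get(key)}


# Hypothetical registry row; the UEI is a placeholder, not a real value.
known_entities = [
    {"uei": "EXAMPLE-PARENT-UEI", "cage_code": "95403",
     "legal_business_name": "General Dynamics Corp"},
]
index = build_owner_index(known_entities)

highest = backfill_owner_uei(
    {"cage_code": "95403", "legal_business_name": "GENERAL DYNAMICS CORP"},
    index,
)
immediate = backfill_owner_uei(
    {"cage_code": "8JFT1",
     "legal_business_name": "GENERAL DYNAMICS LAND SYSTEMS, GLOBAL LLC"},
    index,
)
```

Requiring both cage_code and name to match is deliberately conservative: an owner that matches on only one of the two stays unresolved rather than risking a wrong parent_uei.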
3. Vehicle program_acronym is often null
In a small initial probe (5 parent PIIDs that matched a Vehicle), 2
came back with program_acronym=None — both were agency-specific
BPA/BOA-style vehicles. They still get vehicle_type and the rollup
counts (idv_count, order_count, etc.), which is enough for
analytical aggregation, but we lose the friendly display label for
those rows.
Not really a bug — the underlying SAM / FPDS data legitimately doesn't
have a name for these — but for consumers building a "By Vehicle"
view it'd help to know in advance which vehicle classes tend to lack
a program_acronym, so we can plan a fallback (we ended up rendering
(unnamed <type>) as the display label).
Possible fix on the docs side: a one-line note on
VEHICLE_SCHEMA.program_acronym listing the vehicle classes that
typically have it populated (GWAC, large multi-agency IDCs) vs
typically null (agency-specific BPAs, BOAs, single-award IDCs).
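
For anyone building a similar view, the fallback label we described can be sketched in a few lines. The function name and the sample vehicle_type values are illustrative assumptions; the "(unnamed <type>)" format is the one we actually render.

```python
def vehicle_display_label(vehicle):
    """Prefer program_acronym; otherwise fall back to '(unnamed <type>)'."""
    acronym = vehicle.get("program_acronym")
    if acronym:
        return acronym
    # Assumed last-resort wording when vehicle_type is missing too.
    vehicle_type = vehicle.get("vehicle_type") or "vehicle"
    return f"(unnamed {vehicle_type})"


# Hypothetical rows; real vehicle_type values may be spelled differently.
named = vehicle_display_label(
    {"program_acronym": "OASIS+", "vehicle_type": "IDC"})
unnamed = vehicle_display_label(
    {"program_acronym": None, "vehicle_type": "BPA"})
```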
4. rate_limit_info exposes daily totals but not per-endpoint utilization
For a multi-workflow account where several refresh jobs share quota,
it'd be useful to know which endpoints are burning quota fastest
without instrumenting application-side counters. Tier-up territory
for us, but flagging in case the response is easy to extend.
Possible fix: add endpoint_calls_today: {endpoint_name: count, …}
or similar to the rate-limit response payload.
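
For comparison, this is roughly what the application-side instrumentation looks like if done with a thin proxy around the client. CountingClient and FakeClient are hypothetical names, and the proxy counts call attempts made by one process, which may not match what the server charges against quota.

```python
from collections import Counter


class CountingClient:
    """Proxy that counts method calls per name before delegating.

    Every callable attribute is counted, so non-API helper methods on the
    wrapped client would show up in the tally as well.
    """

    def __init__(self, client):
        self._client = client
        self.calls = Counter()

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._client, name)
        if not callable(attr):
            return attr

        def wrapper(*args, **kwargs):
            self.calls[name] += 1
            return attr(*args, **kwargs)

        return wrapper


class FakeClient:
    """Offline stand-in for the real client."""

    def list_idvs(self, **kwargs):
        return []

    def list_vehicles(self, **kwargs):
        return []


counting = CountingClient(FakeClient())
counting.list_idvs(piid="PIID-1")
counting.list_idvs(piid="PIID-2")
counting.list_vehicles(search="SOL-X")
print(dict(counting.calls))  # {'list_idvs': 2, 'list_vehicles': 1}
```

This works within one process, but with several refresh jobs sharing a quota the counters would need to be aggregated somewhere, which is the overhead a server-side breakdown would remove.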
Wishlist (small things, no concrete bug)
- A list_vehicles(piid=parent_idv_piid) or (solicitation_identifier=…, exact=True) filter. This is the biggest single ergonomic win for the use case here (collapses item 1 from two calls to one).
- A docs page or schema annotation flagging which fields tend to be null by vehicle/contract class (item 3).
- UEI on immediate_owner / highest_owner when resolvable (item 2).
What Tango uniquely provided here
The program-level vehicle layer (program_acronym, vehicle_type,
and the per-vehicle rollup metrics under VEHICLE_SCHEMA) is the
piece with no public alternative we could find — USASpending exposes
parent IDV PIIDs but doesn't name the program-level family those IDVs
belong to. That's what motivated the integration.