Add company ID to ald_demo #355

vintented · 2021-01-26T14:20:11Z

It would be great to add company IDs (unique numeric sequences) to ald_demo so Asset Resolution can start integrating them into the PACTA for Banks datasets. This addition would ensure that production is always aggregated to the correct entity and help with QA issues and tracking.

Let me know if I can provide any additional details.

maurolepore · 2021-01-26T16:21:43Z

Thanks! @jdhoffa what do you think?

jdhoffa · 2021-01-27T09:10:15Z

I think this is a good call, but I have a hunch it may require some changes to r2dii.match. In particular, I think that in principle, this ID should replace the id_r2dii ID that we generate in that package. HOWEVER, if for some reason the IDs provided by AR turn out to be corrupt, it could break r2dii.match, so maybe we could run a check at the beginning of match_name(): use the AR IDs if they satisfy our uniqueness criteria (unique by company name + sector), and otherwise fall back to id_r2dii.

In any case, I think we should start with a PR here, introducing the new column, and leave this PR open while we test the effects on downstream packages. It should be a concerted effort, so let's flag this in the next PACTA dev prioritization session, and figure out when we'll have time to do it.
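A minimal sketch of the check described above, in base R. The helper names `ids_are_usable()` and `pick_id_column()` are hypothetical and not part of r2dii.match, the column name `id_ar` is an assumption, and the sector values are only illustrative. It reads "unique by company name + sector" as "each company name + sector pair maps to exactly one non-missing ID", which is one possible interpretation.

```r
# Hypothetical helper, not part of r2dii.match.
# TRUE when the ID column exists, has no missing values, and each
# (name_company, sector) pair maps to exactly one ID.
ids_are_usable <- function(ald, id_col = "id_ar") {
  if (!id_col %in% names(ald)) return(FALSE)
  if (anyNA(ald[[id_col]])) return(FALSE)

  # Distinct (name, sector, id) combinations.
  combos <- unique(ald[c("name_company", "sector", id_col)])
  # A repeated (name, sector) pair here means it carries more than one ID.
  !any(duplicated(combos[c("name_company", "sector")]))
}

# Hypothetical helper: use the AR IDs when they pass the check,
# otherwise fall back to the generated id_r2dii.
pick_id_column <- function(ald, id_col = "id_ar") {
  if (ids_are_usable(ald, id_col)) id_col else "id_r2dii"
}

# Illustrative data: the same company keeps one ID across sectors.
good <- data.frame(
  name_company = c("interoil argentina as", "interoil argentina as", "boeing co/the"),
  sector = c("power", "oil and gas", "power"),
  id_ar = c(478460L, 478460L, 931L),
  stringsAsFactors = FALSE
)

# Corrupt data: one company + sector pair carries two different IDs.
bad <- rbind(
  good,
  data.frame(
    name_company = "boeing co/the", sector = "power", id_ar = 932L,
    stringsAsFactors = FALSE
  )
)

pick_id_column(good)
pick_id_column(bad)
```

With this reading, `good` would use `id_ar` and `bad` would fall back to `id_r2dii`.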

vintented · 2021-01-27T09:26:59Z

@jdhoffa to confirm, the company IDs are unique. @tposey28 I thought adding IDs to the PACTA for Banks dataset would be a useful addition :)

maurolepore · 2021-01-27T13:10:07Z

Can you explain a bit what the company IDs are and where they come from? Can you show an example?

> This addition would ensure that production is always aggregated to the correct entity and help with QA issues and tracking.

@jdhoffa, would it be safer to move backwards? Is there a way to map the matched output back to the company IDs so that we can get the benefits of the IDs in r2dii.analysis without changing upstream code?
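One way to picture mapping the matched output back to the company IDs is a plain join, sketched below in base R. This assumes the matched output carries `name_ald` and `sector_ald` columns and that the ALD carries `name_company`, `sector`, and a new `id_ar` column; `id_loan` and the toy rows are illustrative only.

```r
# Toy stand-in for matched output (column names are assumptions).
matched <- data.frame(
  id_loan = c("L1", "L2"),
  name_ald = c("interoil argentina as", "boeing co/the"),
  sector_ald = c("power", "power"),
  stringsAsFactors = FALSE
)

# Toy stand-in for an ALD that already has the company ID column.
ald_ids <- data.frame(
  name_company = c("interoil argentina as", "boeing co/the"),
  sector = c("power", "power"),
  id_ar = c(478460L, 931L),
  stringsAsFactors = FALSE
)

# Left join: keep every matched row and attach the company ID where the
# company name and sector agree. Rows without a counterpart get NA.
with_ids <- merge(
  matched, ald_ids,
  by.x = c("name_ald", "sector_ald"),
  by.y = c("name_company", "sector"),
  all.x = TRUE
)

with_ids
```

This leaves the matching code untouched and only adds the ID afterwards, at the cost of relying on name + sector being a reliable join key.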

vintented · 2021-02-02T15:40:50Z

@maurolepore sorry about the slow response. It has already been a busy month. The company IDs are generated in the Asset Resolution database and are unique to each entity. If needed, @tposey28 can provide additional details.

Here are some examples:
Company ID | Company Name
478460 | Interoil Argentina As
931 | Boeing Co/The

maurolepore · 2021-02-02T15:58:22Z

Thanks @vintented

Am I right in understanding that the ald_demo you wish for would look like this?

devtools::load_all()
#> Loading r2dii.match

ald_wish <- fake_ald(
  id_ar = c(478460, 931), 
  name_company = tolower(c("Interoil Argentina As", "Boeing Co/The"))
)

ald_wish
#> # A tibble: 2 x 4
#>   name_company          sector alias_ald                id_ar
#>   <chr>                 <chr>  <chr>                    <dbl>
#> 1 interoil argentina as power  alpineknitsindiapvt ltd 478460
#> 2 boeing co/the         power  alpineknitsindiapvt ltd    931

Created on 2021-02-02 by the reprex package (v0.3.0)

(Here id_ar means id_<asset resolution> but could be any other meaningful name.)

jdhoffa · 2021-02-02T17:13:07Z

> The company IDs are generated in Asset Resolution database and are unique to each entity.

Are they unique to each entity + sector combination? I.e., would Interoil Argentina As likely have the same ID, regardless of whether we're talking about the power or gas sector? (This is fine, it's just something we need to think about when we work to fix match_name to allow us to use it.)

vintented · 2021-02-02T17:34:41Z

@jdhoffa they are unique to company regardless of the sector, or in other words, consistent across sectors.

maurolepore · 2021-02-02T17:47:15Z

> This addition would ensure that production is always aggregated to the correct entity and help with QA issues and tracking.

@jdhoffa, am I right in thinking that the benefit that @vintented wants is at the level of r2dii.analysis -- not further upstream?

We can change things upstream, but do we have to? Or can we first get the benefits in a safer way, then roll the solution deeper into the dependency tree?

tposey28 · 2021-02-02T18:11:06Z

Hey jumping in here quickly!
They are unique, but not by sector. Sometimes I use a hacky concatenation of the AR ID and '-sector' to get a unique ID at that level, but the IDs are unique for the entity at a company level.

The IDs are kept consistent by comparing the Bloomberg IDs and LEIs if there is financial data; if not, then the Global Data ID if available; if not, then the unique simplified name and country combination. We run this matching logic against every new quarter of data. Of course, an old company may slip in with a new ID because it failed at all of these steps, but then Vincent and I will often go through and reconcile these with the old IDs (the important, noticeable ones at least) by looking for old IDs that lost production and new IDs that gained production.

The main benefit for this is if Global Data or Bloomberg changes a name, but not their ID, we will update the name without changing the AR ID. This is useful for clients who may otherwise argue that a company disappeared.
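The fallback order described above can be sketched roughly as follows. This is only an illustration, not Asset Resolution's actual process: all column names, the helper names `matching_key()` and `carry_forward_ids()`, and the toy companies and identifiers are made up, and trying the Bloomberg ID before the LEI is an assumption.

```r
# Hypothetical helper: pick the first available identifier per company.
# Order: Bloomberg ID, LEI, Global Data ID, simplified name + country.
matching_key <- function(x) {
  name_country <- paste(x$simplified_name, x$country, sep = "|")
  ifelse(!is.na(x$bloomberg_id), paste0("bbg:", x$bloomberg_id),
    ifelse(!is.na(x$lei), paste0("lei:", x$lei),
      ifelse(!is.na(x$global_data_id), paste0("gd:", x$global_data_id),
        paste0("name:", name_country)
      )
    )
  )
}

# Hypothetical helper: reuse last quarter's ID where the keys agree,
# and hand out fresh IDs to companies that match nothing.
carry_forward_ids <- function(old, new) {
  new$id_ar <- old$id_ar[match(matching_key(new), matching_key(old))]
  unmatched <- is.na(new$id_ar)
  new$id_ar[unmatched] <- max(old$id_ar) + seq_len(sum(unmatched))
  new
}

# Made-up data for two consecutive quarters.
old <- data.frame(
  id_ar = c(1, 2, 3),
  bloomberg_id = c("BBG-A", NA, NA),
  lei = rep(NA_character_, 3),
  global_data_id = c(NA, "GD-B", NA),
  simplified_name = c("company a", "company b", "company c"),
  country = c("AR", "US", "DE"),
  stringsAsFactors = FALSE
)
new <- data.frame(
  bloomberg_id = c("BBG-A", NA, NA, NA),
  lei = rep(NA_character_, 4),
  global_data_id = c(NA, "GD-B", NA, NA),
  # "company a" was renamed by the data provider but kept its Bloomberg ID.
  simplified_name = c("company a renamed", "company b", "company c", "company d"),
  country = c("AR", "US", "DE", "FR"),
  stringsAsFactors = FALSE
)

updated <- carry_forward_ids(old, new)
updated[c("simplified_name", "id_ar")]
```

In this toy case the renamed company keeps its ID because its Bloomberg ID is unchanged, and only the genuinely new company gets a new ID. A company whose available identifiers change between quarters would not match under this single-key simplification, which is the kind of case the manual reconciliation described above would have to catch.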

jdhoffa · 2021-02-04T08:26:47Z

@maurolepore I think all of this is fine:

1. Add the new column to ald_demo (and ensure it doesn't break any downstream tests; I don't think it will do anything as is, tbh).
2. See if we can make use of the new ID column in match_name() and see if it can replace id_r2dii (likely in concert with sector using group_indices()). I don't think just adding the column to ald_demo will output anything new or useful in match_name().

I will open the first point as a draft PR today, and let's see if anything breaks and go from there.
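As a rough illustration of combining the company ID with sector, the sketch below builds a company + sector key and turns it into a group index in base R. It mirrors the idea behind group_indices() without depending on dplyr; the column names `id_ar`, `id_ar_sector`, and `id_group` and the sector values are assumptions for illustration.

```r
# Toy ALD: the same company ID appears in more than one sector.
ald <- data.frame(
  id_ar = c(478460L, 478460L, 931L, 478460L),
  sector = c("power", "oil and gas", "power", "power"),
  stringsAsFactors = FALSE
)

# Concatenate company ID and sector to get a key that is unique at the
# company + sector level.
ald$id_ar_sector <- paste(ald$id_ar, ald$sector, sep = "-")

# Number the distinct keys in order of first appearance.
ald$id_group <- match(ald$id_ar_sector, unique(ald$id_ar_sector))

ald
```

Rows 1 and 4 share a key and therefore share a group index. Note that this numbers groups by first appearance, whereas dplyr numbers them by sorted group key, so the integers may differ even though the grouping is the same.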

vintented · 2021-02-04T16:09:36Z

@jdhoffa and @maurolepore super exciting! You would be surprised how excited people are about company IDs. Let me know if I can help in any way :)

georgeharris2deg · 2021-02-09T15:11:27Z

@jdhoffa @maurolepore @tposey28 @vintented @daisy-pacheco @Lauramirez-2ii

This is needed for some open engagement with emerging market banks.
Not sure of the prioritisation process but just wanted to bump this up the list.
Happy to discuss

Thanks,

georgeharris2deg · 2021-02-09T16:37:50Z

Reconsidering the above comment and noting the conversation from the PACTA - AR call:

This solution of adding unique IDs to each AR data release for banks, and the subsequent code changes that this will require, is no longer a top priority. It would be good to have for March 2021 and will help a bank preserve their matches from a previous matching exercise, year on year.

A short-term solution for a bank wanting to match now using the old (Q4 2019) ALD data is to proceed with matching and then use an Excel bridging file to manually carry over the old matches to the new Q4 2020 data set.
NB - this is the solution for open emerging markets banks

thanks all!

jdhoffa · 2021-02-22T14:17:47Z

I moved this issue to r2dii.match, as the remainder of the fix will be in actually allowing the user to implement the id_company in the matching process.

jdhoffa · 2021-07-05T08:18:14Z

Closing in favour of #375

jdhoffa transferred this issue from RMI-PACTA/r2dii.data Feb 22, 2021

jdhoffa mentioned this issue Jul 5, 2021

feat: replace id_r2dii with company_id input from ald_demo #375

Open

jdhoffa closed this as completed Jul 5, 2021
