validate representation of CrunchBase IPOs #1808

Closed
VladimirAlexiev opened this issue Aug 9, 2022 · 9 comments
Labels: FIBO instance data, use case, Using FIBO

@VladimirAlexiev

Could you please validate this representation of CrunchBase IPOs in FIBO?
The simplest representation is 1 node, this one in FIBO is 25 nodes:
image

I made a gist that describes the fields and shows the turtle model.

Thanks in advance! We at Ontotext plan to write a blog post about this experience

@mereolog
Contributor

mereolog commented Aug 10, 2022

Please find below some quick comments from me - @ElisaKendall may give you a more comprehensive review:

  1. I understand that the current ttl is a kind of stub to be filled with the actual data, e.g., '(money_raised)'^^xsd:decimal will be replaced with the actual value - currently these stubs confuse Protege when it saves the file.
  2. I would make <cb/agent/(org_uuid)/issuer> an instance of Offeror (as well as of Issuer).
  3. <cb/exchange/(stock_exchange_symbol)> as an instance of both Exchange and RegistrationAuthority looks wrong given their definitions.
  4. Strictly speaking, the instances of fibo-fnd-acc-cur:Currency, e.g., <cb/currency/(share_price_currency_code)>, introduce potentially new currencies. I do not think you need them - in the proper version of the ttl file you should reuse the current instances thereof.
  5. We recommend using unique IRI fragments even across namespaces (see: https://github.com/edmcouncil/fibo/blob/master/ONTOLOGY_GUIDE.md#general-naming-and-labeling-conventions). Your instances of Currency do not follow this recommendation.
  6. We also recommend having rdfs:labels for all resources - some of yours do not have them, e.g., <cb/exchange/(stock_exchange_symbol)/code>.

Finally, could you publish in the gist some examples of the actual IPO data?
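
Point 1 above concerns placeholder stubs such as `(uuid)` and `(money_raised)` that are later replaced with actual values. A minimal sketch of that substitution, assuming each placeholder is a CSV column name in parentheses. The helper `fill_stub` is hypothetical and is not part of rdf2rml or of any FIBO tooling:

```python
import re

# Hypothetical helper for illustration only; not part of rdf2rml.
# Assumption: a placeholder is a column name wrapped in parentheses.
PLACEHOLDER = re.compile(r"\(([A-Za-z_][A-Za-z0-9_]*)\)")


def fill_stub(template, row):
    """Replace each (column_name) placeholder with the value from one row."""

    def substitute(match):
        column = match.group(1)
        if column not in row:
            raise KeyError(f"row has no column {column!r}")
        return row[column]

    return PLACEHOLDER.sub(substitute, template)


# Made-up sample row; real values would come from the IPO CSV.
row = {"uuid": "abc-123", "money_raised": "1500000"}
print(fill_stub("<cb/ipo/(uuid)/offering>", row))
print(fill_stub("'(money_raised)'^^xsd:decimal", row))
```

The sketch does no IRI escaping or datatype validation, so a real transformation would need both.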

@mereolog mereolog added FIBO instance data issue about FIBO instance data use case use case related issue Using FIBO how to use FIBO labels Aug 10, 2022
@VladimirAlexiev
Author

Thanks @mereolog !

  • 1: Stub used by my rdf2rml tool to generate transformations (R2RML, TARQL CONSTRUCT, OntoRefine UPDATE).
    I understand Protege may get confused because the parenthesized string is not a valid integer, but I hope it can print a warning and still load the data?
  • 2: added fibo-fbc-pas-fpas:Offeror. But then what else should I add to satisfy this?
         rdfs:subClassOf  [ rdf:type            owl:Restriction ;
                           owl:onProperty      fibo-fnd-pty-rl:isPlayedBy ;
                           owl:someValuesFrom  [ rdf:type            owl:Restriction ;
                                                 owl:onProperty      fibo-fnd-pty-pty:isAPartyTo ;
                                                 owl:someValuesFrom  fibo-fbc-pas-fpas:Offering]]
  • 3: I now added <cb/ipo/(uuid)/ticker> a fibo-sec-sec-id:ListedSecurityIdentifier: do you agree? Then there is:
fibo-sec-sec-id:ListedSecurityIdentifier
        rdfs:subClassOf  [ rdf:type            owl:Restriction ;
                           owl:onProperty      fibo-fbc-fct-ra:isRegisteredBy ;
                           owl:someValuesFrom  fibo-fbc-fct-mkt:Exchange] ;
  • The range of fibo-fbc-fct-ra:isRegisteredBy is union ( fibo-fbc-fct-ra:RegistrationAuthority fibo-fbc-fct-ra:Registrar ): which of these 2 classes do you think is more appropriate?
  • 4: Yes, it says there "Better reconcile to fibo-fnd-acc-4217:ISO4217-CodeSet". But for this to happen:
    • I must check that CB currency codes are all from ISO 4217
    • I must use an UPDATE to link eg "USD" to <ISO4217-CurrencyCodes/USDollar>, cannot use a CSV->RDF transform.
  • 5: do you mean that local names of individuals should be unique? That's not very practical for source data that includes only codes but not names (as are CB currencies)
    • It would be more data-engineer friendly if the Currency individual URLs use codes instead of names, just like CurrencyIdentifier individuals do
  • 6: ok, added
  • 7: added ipos-sample.csv
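
The check described in point 4 (are all CB currency codes known, and which existing individual does each map to?) could be sketched as a plain lookup before any UPDATE is written. The table below is illustrative only: its single entry mirrors the "USD" to `<ISO4217-CurrencyCodes/USDollar>` example from this comment, and a real table would have to be derived from the FIBO ISO 4217 ontology. The function name `reconcile` is an assumption.

```python
# Illustrative lookup only. The one entry mirrors the "USD" -> USDollar
# example in the thread; a real table would come from the ISO 4217 ontology.
CODE_TO_LOCAL_NAME = {"USD": "USDollar"}


def reconcile(codes, lookup, base="ISO4217-CurrencyCodes/"):
    """Split codes into those with a known individual and those without."""
    matched = {}
    unmatched = []
    for code in sorted(set(codes)):
        key = code.strip().upper()
        if key in lookup:
            matched[code] = f"<{base}{lookup[key]}>"
        else:
            unmatched.append(code)
    return matched, unmatched


# "XXQ" is a made-up code standing in for a value that fails the check.
matched, unmatched = reconcile(["USD", "XXQ", "USD"], CODE_TO_LOCAL_NAME)
print(matched)
print(unmatched)
```

Any code that lands in `unmatched` would need manual review before the currency nodes are linked.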

Updated the model image and the gist:

image

@VladimirAlexiev
Author

  • 8 (TODO): Currently I use monetary nodes in "national currency" vs in USD with URLs like this:
<cb/ipo/(uuid)/pricePerShare> vs <cb/ipo/(uuid)/pricePerShareUsd>
  • But many IPOs are of US companies, so we can save on nodes if we use URLs like this.
    The difference is subtle but crucial: it will effectively merge the nodes:
<cb/ipo/(uuid)/pricePerShare/(share_price_currency_code)> vs <cb/ipo/(uuid)/pricePerShare/USD>
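
A small sketch of why the second URL scheme merges nodes, assuming one price in the listing currency and one in USD per IPO. The sample rows and function names are made up for illustration:

```python
def price_node_iris(uuid, currency_code):
    """Return (listing-currency IRI, USD IRI) under the currency-suffixed scheme."""
    national = f"<cb/ipo/{uuid}/pricePerShare/{currency_code}>"
    usd = f"<cb/ipo/{uuid}/pricePerShare/USD>"
    return national, usd


def distinct_price_nodes(rows):
    """Collect the set of distinct price nodes for (uuid, currency_code) rows."""
    nodes = set()
    for uuid, currency_code in rows:
        nodes.update(price_node_iris(uuid, currency_code))
    return nodes


# Made-up rows: one IPO priced in USD, one priced in EUR.
rows = [("ipo-1", "USD"), ("ipo-2", "EUR")]
print(len(distinct_price_nodes(rows)))  # 3: the USD-priced IPO needs only one node
```

With the `pricePerShare` / `pricePerShareUsd` scheme the same two rows would produce four nodes.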

@mereolog
Contributor

As for:

  • 2: added fibo-fbc-pas-fpas:Offeror. But then what else should I add to satisfy this?
         rdfs:subClassOf  [ rdf:type            owl:Restriction ;
                           owl:onProperty      fibo-fnd-pty-rl:isPlayedBy ;
                           owl:someValuesFrom  [ rdf:type            owl:Restriction ;
                                                 owl:onProperty      fibo-fnd-pty-pty:isAPartyTo ;
                                                 owl:someValuesFrom  fibo-fbc-pas-fpas:Offering]]

These two triples should do the trick:

<cb/agent/(org_uuid)/issuer> fibo-fnd-pty-rl:isPlayedBy <cb/agent/(org_uuid)>.
<cb/agent/(org_uuid)> fibo-fnd-pty-pty:isAPartyTo <cb/ipo/(uuid)/offering>.
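
To see why those two triples are enough, here is a rough data check that mirrors the nested restriction: the role must be played by something that is a party to some Offering. It assumes the offering node is explicitly typed as fibo-fbc-pas-fpas:Offering and uses plain tuples instead of an RDF library. It is a closed-world check over asserted triples, not OWL reasoning, and the sample identifiers are made up.

```python
IS_PLAYED_BY = "fibo-fnd-pty-rl:isPlayedBy"
IS_A_PARTY_TO = "fibo-fnd-pty-pty:isAPartyTo"
RDF_TYPE = "rdf:type"
OFFERING = "fibo-fbc-pas-fpas:Offering"


def satisfies_offeror_restriction(role, triples):
    """True if role isPlayedBy some x, and x isAPartyTo some asserted Offering."""
    players = {o for s, p, o in triples if s == role and p == IS_PLAYED_BY}
    offerings = {s for s, p, o in triples if p == RDF_TYPE and o == OFFERING}
    for s, p, o in triples:
        if s in players and p == IS_A_PARTY_TO and o in offerings:
            return True
    return False


# Made-up identifiers standing in for filled-in (org_uuid) and (uuid) stubs.
triples = [
    ("<cb/agent/org-1/issuer>", IS_PLAYED_BY, "<cb/agent/org-1>"),
    ("<cb/agent/org-1>", IS_A_PARTY_TO, "<cb/ipo/ipo-1/offering>"),
    ("<cb/ipo/ipo-1/offering>", RDF_TYPE, OFFERING),
]
print(satisfies_offeror_restriction("<cb/agent/org-1/issuer>", triples))  # True
```

Under OWL's open-world semantics a missing triple is not a violation, so a check like this is only useful as a data-completeness test.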

@mereolog
Contributor

As for the rest of your questions, you should get in touch with @ElisaKendall.

@ElisaKendall
Contributor

@VladimirAlexiev
For 3, for ticker symbol, your individuals should be of type ListedSecurityIdentifier and TickerSymbol (both are in the SecuritiesIdentification ontology). Ticker symbols are reassignable, and are specific to the exchange, though.

Related to that, we don't have a reference ontology with all of the ticker symbols for any exchange (since there are over 7K exchanges), but we do have the latest ISO MIC codes as of the end of June (Q2 2022) - these are either operating level or market segment level codes. The class MarketIdentifier captures both cases, but the codes themselves are in the FBC/FunctionalEntities/MarketsIndividuals ontology and they require the BusinessCentersIndividuals ontology (which says where they are located per whatever they reported to ISO). The only thing is that you need to know whether it is an operating level or segment level market. For IPOs in the US, for issuance of shares of stock, it is typically either the NYSE (Exchange - fibo-fbc-fct-mkti:Exchange-XNYS, MIC - fibo-fbc-fct-mkti:MIC-XNYS), or the Nasdaq (Exchange - fibo-fbc-fct-mkti:Exchange-XNAS, MIC - fibo-fbc-fct-mkti:MIC-XNAS).

A given IPO will be in a specific market, such as the NYSE or Nasdaq in the US, so for the Crunchbase IPO you should be able to get all of those details from the exchange individual, e.g., fibo-fbc-fct-mkti:Exchange-XNAS, which will even tell you which municipality or, in some cases, which FpML business center the exchange is located in.

For isRegisteredBy, the difference is whether it is registered by the official RA or by someone that they have delegated that role to. Typically for a security, the exchange is the registration authority. So in the individual you create for the exchange, add a triple saying that it is the registration authority and you are done.

So for currency codes, what about using a property chain for hasCurrency --> isIdentifiedBy? You could then use the resulting code rather than the name for matching? We did that at Wells Fargo for a similar case, no problem.

One other thing - you may be able to map at least some of the details to openFIGI (https://www.openfigi.com/) if you have the FIGI for the security. At least some of their symbology data is publicly available. And, some of the data related to the legal entity should be mappable to Open Corporates (https://opencorporates.com/companies?jurisdiction_code=&q=Crunchbase&utf8=%E2%9C%93). The Delaware corporation is likely the one doing the IPO, but you would have to confirm through the offering statement.

Hope this helps - I'll keep checking back, and thanks @mereolog for your help with the other questions!

Elisa

@VladimirAlexiev
Author

Hi @ElisaKendall, thanks for the feedback!

  • 3a: fibo-sec-sec-id:ListedSecurityIdentifier wants fibo-fbc-fct-ra:isRegisteredBy some fibo-fbc-fct-mkt:Exchange.
    • fibo-fbc-fct-ra:isRegisteredBy has range union ( fibo-fbc-fct-ra:RegistrationAuthority fibo-fbc-fct-ra:Registrar )
    • fibo-fbc-fct-mkt:Exchange is a subclass of neither.
    • So as you said, I'll add a second type fibo-fbc-fct-ra:RegistrationAuthority
  • 9: "there are over 7K exchanges... have the latest ISO MIC codes.. operating level or market segment level codes":
    I know, I added https://www.wikidata.org/wiki/Property:P7534 and matched all 1947 values (as of Apr 2020) to WD, see https://www.wikidata.org/wiki/Property_talk:P7534. Is it really 7k values as of today? I sincerely hope not :-)
  • 10: "for the Crunchbase IPO you should be able to get all of those details from the exchange individual".
    • Unfortunately CB Exchanges use ambiguous codes, eg 'mse' could be Metropolitan or Mongolian. But we added them to WD so now we can use WD queries to match between MIC and CB exchanges. Out of 164 CB exchange codes:
    • 1 pair uses the same code: https://w.wiki/5aYS
    • 8 have two codes: https://w.wiki/5aYP (which is not a crime)
    • 18 are not in MIC: https://w.wiki/5aYM (I reduced this to 13)
  • 8: done
  • 11: "what about using a property chain for hasCurrency --> isIdentifiedBy":
    How will I express this in RDF? Prop chains can be used as a shortcut to navigate quickly through nodes. But I need those nodes when making RDF?
  • 12: OpenFigi: well, a search for Tencent Holdings returns 18k securities, which use different (not MIC) exchange codes, so I don't think we'll go there just yet.
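
The kind of tally described in point 10 (codes shared by several exchanges, exchanges with several codes, exchanges without a MIC) can be sketched over a flat list of matches. The records below are made up for illustration: only the ambiguous 'mse' code and the XNYS/XNAS MICs come from this thread, while the other CB codes and the placeholder MIC strings are assumptions. The real figures come from the Wikidata queries linked above.

```python
from collections import defaultdict


def summarize(records):
    """Group (exchange, cb_code, mic) records into three problem categories."""
    exchanges_by_code = defaultdict(set)
    codes_by_exchange = defaultdict(set)
    missing_mic = set()
    for exchange, cb_code, mic in records:
        exchanges_by_code[cb_code].add(exchange)
        codes_by_exchange[exchange].add(cb_code)
        if mic is None:
            missing_mic.add(exchange)
    shared = {c: sorted(e) for c, e in exchanges_by_code.items() if len(e) > 1}
    multi = {e: sorted(c) for e, c in codes_by_exchange.items() if len(c) > 1}
    return shared, multi, sorted(missing_mic)


# Hypothetical sample data; "MIC-A" and "MIC-B" are placeholders, not real MICs.
records = [
    ("Metropolitan", "mse", "MIC-A"),
    ("Mongolian", "mse", "MIC-B"),
    ("NYSE", "nyse", "XNYS"),
    ("Nasdaq", "nasdaq", "XNAS"),
    ("Nasdaq", "nasd", "XNAS"),
    ("Unlisted Example", "uex", None),
]
shared, multi, missing = summarize(records)
print(shared)
print(multi)
print(missing)
```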

Do you see any problems with the relations, or did I get them right?

image

BTW this mapping was done on request by @peio.
His comment: "Both impressive and scary" ;-)

@ElisaKendall
Contributor

@VladimirAlexiev
For 9 - yes, over 7K individuals, but I think it's really half that for the number of exchanges because we have both the exchanges and the MIC codes represented :). The reference data is up to date as of June 2022 now. Currency codes too - we updated those recently.

There are example property chains in the FND/Parties/Parties ontology for a bunch of relationships that we use for linking counterparties up. If you follow the pattern there, you should end up with something like

<owl:ObjectProperty rdf:about="<your prefix>hasCurrencyCode">
	<rdfs:label>has currency code</rdfs:label>
	<owl:propertyChainAxiom rdf:parseType="Collection">
		<rdf:Description rdf:about="&fibo-fnd-acc-cur;hasCurrency">
		</rdf:Description>
		<rdf:Description rdf:about="&lcc-lr;isIdentifiedBy">
		</rdf:Description>
	</owl:propertyChainAxiom>
	<skos:definition>relates something to the relevant currency code for the currency in question</skos:definition>
</owl:ObjectProperty>

We have all of the currencies and currency codes already in FIBO, so you may not need any additional nodes to do this. They are in the ISO 4217 ontology, but I'm assuming that you already knew that.
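
As a rough illustration of what such a property chain entails, the sketch below derives the shortcut triple from the two underlying ones using plain tuples. It is not an OWL reasoner, and every `ex:` name is a hypothetical placeholder rather than an actual FIBO IRI:

```python
from collections import defaultdict


def apply_property_chain(triples, first, second, derived):
    """Infer (s, derived, z) wherever (s, first, y) and (y, second, z) both hold."""
    second_objects = defaultdict(set)
    for s, p, o in triples:
        if p == second:
            second_objects[s].add(o)
    inferred = set()
    for s, p, o in triples:
        if p == first:
            for target in second_objects.get(o, ()):
                inferred.add((s, derived, target))
    return inferred


# Hypothetical nodes; the ex: names stand in for real IRIs.
triples = [
    ("ex:pricePerShare-1", "fibo-fnd-acc-cur:hasCurrency", "ex:USDollar"),
    ("ex:USDollar", "lcc-lr:isIdentifiedBy", "ex:USD-code"),
]
inferred = apply_property_chain(
    triples,
    "fibo-fnd-acc-cur:hasCurrency",
    "lcc-lr:isIdentifiedBy",
    "ex:hasCurrencyCode",
)
print(inferred)
```

In this sketch the `hasCurrency` and `isIdentifiedBy` triples still have to be present; the chain only adds the shortcut on top of them.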

This is getting much closer to what I would expect. SEC/Equities/EquitiesExampleIndividuals has examples for Alphabet and Apple shares that might be useful for comparison if you haven't already looked there. They include properties for Apple stock on multiple exchanges, and state that the listing is in USD, so maybe those examples can provide some additional ideas. Let me know if you have other questions, and good luck!

@VladimirAlexiev
Author

VladimirAlexiev commented Aug 17, 2022

@ElisaKendall @mereolog

  • 11: I don't think I want to use custom:hasCurrencyCode in my data. I want to use the standard property hasCurrency
  • 13: I know about fibo-fnd-acc-4217:ISO4217-CodeSet but it's a bit hard to use; see "sameAs links between currency-USD and currency-USDollar" #1816
  • I've looked at some of the examples (Apple). I think I missed a few dates, will add them now

PS: did you write that RDF/XML by hand?! I both admire and pity you :-)
Cheers!
