Hi everyone. Really, really great project! The story behind it, and the focus on being user friendly despite the very powerful features, is fantastic!
I'm from @HXL-CPLP / @EticaAI, and lately I've been looking into how to document ways people could make use of HXLated (1) data. Orange and its appeal really match the user base of the HXL Standard!
Context
The standard already has a public dictionary (2), and it would make sense for us to add some functionality to improve auto-recognition. It doesn't need to be core functionality (in fact, Orange's automated selection already tends to make decent decisions most of the time), BUT an add-on for Orange capable of understanding HXL would... well, give stronger hints about what the data is!
Because of the way the HXL standard works, depending on the attributes the input dataset has, this would also mean changing the data. An obvious example would be #meta becoming a meta in Orange (though Orange's automation tends to guess this already). I do not think it would be necessary to port a full-featured HXL Proxy (see https://proxy.hxlstandard.org/), but for some features, such as exploding a Well-Known Text value like Point(12.34 56.79) into two columns of latitude and longitude, I believe it would be far easier to create extensions than to document for users how to do this. For other things, I guess it is better that we first test how the feature could be done in Orange by composing widgets.
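To make the Well-Known Text case concrete, here is a minimal sketch in plain Python (no Orange or HXL library involved). The function name `split_wkt_point`, the `lon_first` parameter and the sample rows are hypothetical. One assumption matters: WKT is conventionally written as `POINT(x y)`, which for geographic data usually means longitude first, so the sketch defaults to that order and lets the caller flip it.

```python
import re

# Matches a simple 2D WKT point such as "Point(12.34 56.79)" (case-insensitive).
# Exponent notation and Z/M coordinates are deliberately not handled here.
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*([-+]?\d+(?:\.\d+)?)\s+([-+]?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def split_wkt_point(value, lon_first=True):
    """Return (latitude, longitude) from a 2D WKT point, or None if it does not parse.

    lon_first=True assumes the usual WKT "x y" order (longitude, then latitude).
    """
    match = _WKT_POINT.match(value)
    if match is None:
        return None
    first, second = float(match.group(1)), float(match.group(2))
    return (second, first) if lon_first else (first, second)


# Hypothetical rows keyed by HXL hashtag; the second one is intentionally invalid.
rows = [{"#geo+coord": "Point(12.34 56.79)"}, {"#geo+coord": "not a point"}]
for row in rows:
    parsed = split_wkt_point(row["#geo+coord"])
    row["#geo+lat"], row["#geo+lon"] = parsed if parsed else (None, None)
```

Values that do not parse are left as `None` rather than raising, on the assumption that hand-edited humanitarian spreadsheets will contain some malformed cells.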
Example: a very common task with HXL is that the user has a table with the focus data (let's say, people targeted for assistance, or people in need), and that data has a reference column containing P-Codes (3). The matching reference data would already be available, either online or ready for the user to download to disk. With Orange, since HXL is predictable, even when using the core functionality for merging datasets, the idea would be to propose the relational keys. The more advanced case, however, would be a specialized plugin that already downloads the most popular reference data.
There are some features I'm not really sure how to do, but I will mention them here anyway, even if they only become relevant much later. The one I have the least idea how to implement may also be the most relevant hint HXL could give to inference: that some columns are about the same subject, even if the numbers are not obviously related. Simple cases would be "source vs destination" or "doctor vs patient", but I could also think of "hospital vs doctor vs nurse team vs patient" (because the same doctors could work in more than one place). If it were possible to give hints about the natural groupings, this could allow unmasking averages. For example, by default I think Orange would make no distinction that a variable such as how many hours a professional is working was about the doctor, not the nurse or the patient. I think this would be possible with more complex steps if the user already knows very well what they are trying to find, but maybe an HXL add-on, plus example projects that pass such hints to other steps in Orange, could make predictions with less noise just because the data was already HXLated (before getting into Orange) in ways that hint at the relations.
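A rough sketch of what "proposing the relational keys" could look like, in plain Python and independent of Orange's own merge widget. Everything here is hypothetical: the helper names, the rule "same base hashtag and both columns carry +code", and the sample P-Codes and figures, which are made up for illustration.

```python
def split_tag(header):
    """Split an HXL header like '#adm1+code' into ('#adm1', frozenset({'code'}))."""
    parts = header.replace(" ", "").lower().split("+")
    return parts[0], frozenset(parts[1:])


def propose_join_keys(left_headers, right_headers):
    """Pair columns that share a base hashtag and both carry the +code attribute."""
    proposals = []
    for left in left_headers:
        l_tag, l_attrs = split_tag(left)
        for right in right_headers:
            r_tag, r_attrs = split_tag(right)
            if l_tag == r_tag and "code" in l_attrs and "code" in r_attrs:
                proposals.append((left, right))
    return proposals


def left_join(rows, reference, left_key, right_key):
    """Attach reference columns to each row; rows without a match are kept unchanged."""
    index = {ref[right_key]: ref for ref in reference}
    merged = []
    for row in rows:
        match = index.get(row[left_key], {})
        extra = {k: v for k, v in match.items() if k != right_key}
        merged.append({**row, **extra})
    return merged


# Made-up focus data (people in need) and made-up reference data (place names).
needs = [
    {"#adm1+code": "XX01", "#inneed": 1200},
    {"#adm1+code": "XX02", "#inneed": 300},
]
places = [
    {"#adm1+code": "XX01", "#adm1+name": "North"},
    {"#adm1+code": "XX02", "#adm1+name": "South"},
]

keys = propose_join_keys(list(needs[0]), list(places[0]))
left_key, right_key = keys[0]
merged = left_join(needs, places, left_key, right_key)
```

In an add-on, the proposals would presumably be shown to the user as a pre-selected default rather than applied silently, since the sensitive table is the one being extended.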
The initial idea with this open discussion
I think our group will draft an add-on for Orange, at least for some small changes, to see how things fit together!
But in general, the way HXL works, the +attributes would become hints compared to plain CSV. The users, by the nature of emergency response, tend to be under stress and are merging data that can be public (like references to places) with their actual data, which is sensitive.
A common trend in data for humanitarian use is having both a time component and a place component. Time is, well, procedural, but ideally our user group is likely to also try to automate the generation of the reference tables and keep them in sync with the OCHA ones, so it could be viable to have sub-national borders. However, for the sake of user simplicity, at least at administrative boundary level 0 (often, but not always, a country), I think one nice-to-have would be to allow the user to have a column with codes (such as ISO 3166-1) ready to use, instead of explicitly needing the latitude and longitude. Other things, such as using UN M49 to infer continents, could be somewhat useful too.
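As an illustration of the UN M49 idea, the sketch below maps an ISO 3166-1 alpha-3 code to its M49 continental region. The two dictionaries are a tiny hand-copied excerpt, not the full tables; a real add-on would presumably load the complete, maintained lists. The function name `infer_region` is hypothetical.

```python
# Excerpt only: ISO 3166-1 alpha-3 code -> UN M49 numeric country code.
ISO3_TO_M49 = {"AGO": "024", "BRA": "076", "MOZ": "508", "PRT": "620"}

# Excerpt only: M49 country code -> (M49 region code, region name).
M49_TO_REGION = {
    "024": ("002", "Africa"),
    "076": ("019", "Americas"),
    "508": ("002", "Africa"),
    "620": ("150", "Europe"),
}


def infer_region(iso3):
    """Return (region_code, region_name) for an alpha-3 code, or None if unknown."""
    m49 = ISO3_TO_M49.get(iso3.strip().upper())
    return M49_TO_REGION.get(m49) if m49 else None


for code in ["BRA", "prt", "ZZZ"]:
    print(code, infer_region(code))
```

Unknown codes return `None` instead of a guess, so the user can see which rows still need attention.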
That's it for this open thread! Great project!
Footnotes
Note 1: about HXL
HXL is heavily used for labeling humanitarian data (https://hxlstandard.org/how-it-works/). It has both command line tools and a public proxy maintained by UN OCHA, which works as a sort of user-friendly, yet very powerful, extract, transform, load (ETL) frontend, while users can keep source data edited collaboratively even on Google Sheets. One place to find HXLated datasets is https://data.humdata.org/dataset?ext_hxl=1. Our group is also publishing a stricter subset of HXL attributes with direct mapping to RDF, but datasets in this format (while easier to parse) would lose the forgiving way in which users can tag manually.
Note 2: about HXL standard dictionary
The reference link is https://hxlstandard.org/standard/dictionary/. However, even if at first HXL seems tabular and a dataset could always use the same exact strings, the base #hashtags and +attributes are additive. So a latitude could appear in a dataset as #geo+lat or #loc+lat (via the +lat attribute), or as #geo+coord (latitude and longitude in the same field), but at the same time an attribute such as +lat could appear as part of a more complex header, such as #loc+dest+lat (location destination, for example to differentiate it from #loc+origin+lat). The rule of thumb would be: tooling that wants to offer suggestions to the user would tend to accept more than one way of expressing the same thing, in order of preference, while ignoring +attributes it does not know. Even so, the suggestions tend to be predictable, as in having a list of preferences to show to the user. One wonderful example of a user interface that uses HXL as a hint is https://hxldash.com/.
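The rule of thumb above could be sketched roughly as follows, in plain Python without any HXL library. The preference list for "latitude" and the function names are hypothetical, chosen only to show the idea: a header matches a preference when it carries all the wanted attributes, and any extra attributes it has (such as +dest or +origin) are simply ignored.

```python
def parse_hxl(header):
    """Split '#loc +dest +lat' into ('#loc', frozenset({'dest', 'lat'}))."""
    tag, *attrs = header.replace(" ", "").lower().split("+")
    return tag, frozenset(attrs)


# Hypothetical preference list for "latitude", most preferred first.
# Each entry is (base hashtag, or None for any hashtag; required attributes).
LATITUDE_PREFERENCES = [
    ("#geo", {"lat"}),
    (None, {"lat"}),
]


def suggest_columns(headers, preferences):
    """Return matching headers, ordered by the first preference each one satisfies."""
    ranked = []
    for header in headers:
        tag, attrs = parse_hxl(header)
        for rank, (want_tag, want_attrs) in enumerate(preferences):
            # Subset test: attributes beyond the wanted ones do not block a match.
            if (want_tag is None or want_tag == tag) and want_attrs <= attrs:
                ranked.append((rank, header))
                break
    return [header for _, header in sorted(ranked, key=lambda item: item[0])]


headers = ["#loc+dest+lat", "#loc+origin+lat", "#geo+lat", "#geo+lon", "#meta"]
print(suggest_columns(headers, LATITUDE_PREFERENCES))
# → ['#geo+lat', '#loc+dest+lat', '#loc+origin+lat']
```

Because the sort is stable, headers that tie on preference keep their original column order, which keeps the suggestion list predictable for the user.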