Add infix operators for relational algebra #8036
I admire your steadfast commitment to exploring the final frontiers of Unicode, @jiahao.
These are the voyages of the flagship Julia. Its five-year mission: to explore strange new symbols, to seek out new codepoints and new characters, to boldly go where no language has gone before.
I'm loling on a bus over this exchange. I do like all these Unicode symbols.
The Julia language boldly exploring new frontiers of untypeability ;)
Except that we've simultaneously expanded the frontiers of typeability.
That's actually a good point. How do people feel about (Note that
Shouldn't join have the same precedence as union, i.e. have + precedence? The standard LaTeX name seems to be Since the tab completion is mostly autogenerated from
Aren't the In general, we might think about importing unicode-math-table.tex. See #8044.
+1 for using unimath names. Except why is
Force-pushed from 298cf34 to 14c55d7.
I wasn't aware of unimath; that makes things easy. I've updated this PR with a better choice of precedence class (multiplication; table joins are essentially equivalent to matrix products)
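To make the precedence discussion concrete, here is a minimal sketch in C of a precedence lookup in which the join operators share a class with multiplication while union stays with addition. The enum, the function names, and the table itself are hypothetical and exist only for illustration; the actual precedence tables live in the Julia parser and are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical precedence classes for this sketch; higher binds tighter. */
typedef enum {
    TOY_PREC_NONE = 0,
    TOY_PREC_PLUS = 1,
    TOY_PREC_TIMES = 2
} toy_prec;

/* Map an operator codepoint to its (illustrative) precedence class. */
toy_prec toy_precedence(uint32_t op)
{
    switch (op) {
    case '+':
    case '-':
    case 0x222a: /* ∪ union, addition-like as discussed above */
        return TOY_PREC_PLUS;
    case '*':
    case '/':
    case 0x2a1d: /* ⨝ join */
    case 0x27d5: /* ⟕ left outer join */
    case 0x27d6: /* ⟖ right outer join */
    case 0x27d7: /* ⟗ full outer join */
        return TOY_PREC_TIMES;
    default:
        return TOY_PREC_NONE;
    }
}

/* Nonzero when operator a binds tighter than operator b. */
int toy_binds_tighter(uint32_t a, uint32_t b)
{
    return toy_precedence(a) > toy_precedence(b);
}
```

Under this table, an expression such as A ∪ B ⨝ C would group as A ∪ (B ⨝ C), the same way a + b * c does.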
Can you also update the LaTeX table by re-running the unimath script in
@@ -72,6 +71,7 @@ static int is_wc_cat_id_start(uint32_t wc, utf8proc_propval_t cat)
(wc >= 0x2a00 && wc <= 0x2a06) || // ⨀, ⨁, ⨂, ⨃, ⨄, ⨅, ⨆
(wc >= 0x2a09 && wc <= 0x2a16) || // ⨉, ⨊, ⨋, ⨌, ⨍, ⨎, ⨏, ⨐, ⨑, ⨒, ⨓, ⨔, ⨕, ⨖
wc == 0x2a1b || wc == 0x2a1c)))) || // ⨛, ⨜
wc == 0x2a1d || (wc >= 0x27d5 && wc <= 0x27d7) || //joins: ⨝ ⟕ ⟖ ⟗
If joins are infix operators, then they don't also need to go in is_wc_cat_id_start. Normally we put them in one place or the other.
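The "one place or the other" rule can be sketched as a classifier in which the operator check and the identifier-start check never claim the same codepoint. This is only an illustration in C: the token classes and the toy identifier test (ASCII letters and underscore) are hypothetical stand-ins, not the real lexer, and only the join codepoints are taken from the diff above.

```c
#include <stdint.h>

/* Hypothetical token classes for this sketch. */
typedef enum {
    TOY_TOK_OPERATOR,
    TOY_TOK_ID_START,
    TOY_TOK_UNKNOWN
} toy_token_class;

/* Join operators from the diff: U+2A1D and U+27D5..U+27D7. */
int toy_is_join_operator(uint32_t wc)
{
    return wc == 0x2a1d || (wc >= 0x27d5 && wc <= 0x27d7);
}

/* Toy stand-in for the identifier-start test: ASCII letters and
   underscore only. The join codepoints are deliberately not listed. */
int toy_is_id_start(uint32_t wc)
{
    return (wc >= 'a' && wc <= 'z') || (wc >= 'A' && wc <= 'Z') || wc == '_';
}

/* Each codepoint lands in at most one class. */
toy_token_class toy_classify(uint32_t wc)
{
    if (toy_is_join_operator(wc))
        return TOY_TOK_OPERATOR;
    if (toy_is_id_start(wc))
        return TOY_TOK_ID_START;
    return TOY_TOK_UNKNOWN;
}
```

Keeping the two predicates disjoint means a codepoint never needs an ordering rule to decide whether it starts an identifier or acts as an operator.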
Bump @jiahao.
Will remove this change.
Bump. Also, use spaces rather than tabs.
Force-pushed from 14c55d7 to 3f35762.
Rebased and updated with all comments. I combined and partially rewrote the scripts in the comment block of latex_symbols.jl so that only a single script needed to be run.
Is the travis failure related?
Are some symbols gone? e.g. I can't find
Force-pushed from 6c7c7e3 to 1a4c02f.
Bump. I think we need
Force-pushed from 3f35762 to f1e884d.
Rebased and updated.
@@ -99,6 +99,9 @@ static int is_wc_cat_id_start(uint32_t wc, utf8proc_propval_t cat)
(wc >= 0x2220 && wc <= 0x2222) || // ∠, ∡, ∢
(wc >= 0x299b && wc <= 0x29af) || // ⦛, ⦜, ⦝, ⦞, ⦟, ⦠, ⦡, ⦢, ⦣, ⦤, ⦥, ⦦, ⦧, ⦨, ⦩, ⦪, ⦫, ⦬, ⦭, ⦮, ⦯

// geometric shapes
(wc >= 0x25a0 && wc <= 0x25ff) ||
Aren't many of these already covered by the cat == UTF8PROC_CATEGORY_SO check? It seems like the only ones in category Sm (which have to be special-cased) are U+25F8 to U+25FF.
(It's useful to do the minimal amount of special-casing here for people in other contexts trying to write Julia lexers with regexes etc., e.g. for syntax highlighting in editors.)
Yes, you are right. I just tried some simple function definitions without this commit and it works. I'll take it out.
Ah yes, you don't need it at all, because there is already wc >= 0x25F8 && wc <= 0x25ff above in addition to UTF8PROC_CATEGORY_SO.
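The conclusion of this thread can be sketched as follows: a general category test plus one narrow range is enough, so the broad U+25A0..U+25FF range adds nothing. The C below is a simplified illustration, not the real is_wc_cat_id_start; the enum is a hypothetical stand-in for utf8proc's category values, and the caller supplies the category that utf8proc would normally report.

```c
#include <stdint.h>

/* Hypothetical stand-in for utf8proc general categories; only the
   two discussed in the thread are modeled. */
typedef enum {
    TOY_CAT_SM,   /* Symbol, math */
    TOY_CAT_SO,   /* Symbol, other */
    TOY_CAT_OTHER
} toy_category;

/* Minimal special-casing: accept everything in category So, and add
   only the U+25F8..U+25FF range that the thread identifies as Sm. */
int toy_id_start_minimal(uint32_t wc, toy_category cat)
{
    return cat == TOY_CAT_SO || (wc >= 0x25f8 && wc <= 0x25ff);
}

/* The broader variant from the diff, which also lists U+25A0..U+25FF. */
int toy_id_start_broad(uint32_t wc, toy_category cat)
{
    return toy_id_start_minimal(wc, cat) || (wc >= 0x25a0 && wc <= 0x25ff);
}
```

For any codepoint in the block whose category is So, or which falls in U+25F8..U+25FF, the two variants agree, which is why the extra range could be dropped. A smaller rule is also easier to mirror in regex-based lexers such as editor syntax highlighters.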
Force-pushed from f1e884d to f326c25.
"\\rightouterjoin" => "⟖", # right outer join
"\\fullouterjoin" => "⟗", # full outer join
"\\Join" => "⨝", # join
"\\mathunderbar" => "̲", # combining low line
Maybe just \underbar? math is kind of redundant here.
> math is kind of redundant here.
True; that was the name in unicode-math-table.tex. I can change it manually.
Force-pushed from 2d09759 to ead4141.
I've modified the latex symbol parsing script to strip
Force-pushed from ead4141 to ec96896.
Note: ▷ is semantically a geometric shape, but I could not find a separate character for antijoin.
LGTM once commits are squashed and tests are green.
- Minor clean up of latex_symbols generator script
- Update list of latex symbols
Force-pushed from ec96896 to 93668d1.
Add infix operators for relational algebra
I'm a bit confused: does this mean that these symbols are now defined as relational algebra operators and can be used to perform joins in Julia (since relational algebra was brought up), or are we only talking about adding these symbols to the set of valid Julia characters?
These are recognized by the parser as valid infix operators, but no default implementation of them is given in Base. Packages and user code are free to provide one. (Though some coordination is called for if packages are defining methods for them on Base types.)
This PR includes yet moar Unicode; this time, primarily to support relational algebra.