Building a training set of tags for 8th #109

Open
iHiD opened this issue Oct 31, 2023 · 25 comments

@iHiD
Member

iHiD commented Oct 31, 2023

Hello lovely maintainers 👋

We've recently added "tags" to students' solutions. These express the constructs, paradigms and techniques that a solution uses. We are going to be using these tags for lots of things including filtering, pointing a student to alternative approaches, and much more.

In order to do this, we've built out a full AST-based tagger in C#, which has allowed us to do things like detect recursion or bit shifting. We've set things up so other tracks can do the same for their languages, but it's a lot of work, and we've determined that it may actually be unnecessary. Instead, we think that we can use machine learning to achieve tagging with good enough results. We've fine-tuned a model that can determine the correct tags for C# from the examples with a high success rate. It's also doing reasonably well in an untrained state for other languages. We think that with only a few examples per language, we can potentially get some quite good results, and that we can then refine things further as we go.

I released a new video on the Insiders page that talks through this in more detail.

We're going to be adding a fully-fledged UI in the coming weeks that allows maintainers and mentors to tag solutions and create training sets for the neural networks, but to start with, we're hoping you would be willing to manually tag 20 solutions for this track. In this post we'll add 20 comments, each with a student's solution, and the tags our model has generated. Your mission (should you choose to accept it) is to edit the tags on each comment, removing any incorrect ones and adding any that are missing. In order to build one model that performs well across languages, it's best if you stick as closely as possible to the C# tags. Those are listed here. If you want to add extra tags, that's totally fine, but please don't arbitrarily reword existing tags, even if you don't like what Erik's chosen, as it'll just make it less likely that your language gets the correct tags assigned by the neural network.


To summarise - there are two paths forward for this issue:

  1. You're up for helping: Add a comment saying you're up for helping. Update the tags some time in the next few days. Add a comment when you're done. We'll then add them to our training set and move forward.
  2. You're not up for helping: No problem! Just please add a comment letting us know :)

If you tell us you're not able/wanting to help or there's no comment added, we'll automatically crowd-source this in a week or so.

Finally, if you have questions or want to discuss things, it would be best done on the forum, so the knowledge can be shared across all maintainers in all tracks.

Thanks for your help! 💙


Note: Meta discussion on the forum

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: hello-world

Code

: hello-world \ -- s
"Hello, World!"

Tags:

paradigm:imperative
paradigm:procedural
paradigm:concatenative
construct:colon-definition
construct:string
uses:informal-sed

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: armstrong-numbers

Code

: armstrong? \ n -- 

Tags:

No tags generated

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: isogram

Code

: isogram? \ s -- T

Tags:

construct:backslash
construct:colon
construct:word
construct:parameter
construct:tagged-template-string
construct:visibility-modifiers
paradigm:declarative
paradigm:functional
technique:higher-order-functions
uses:BackslashInString

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: acronym

Code

: acronym \ s -- s
  /[^a-zA-Z']/ s:/ ( 0 1 s:slice s:uc ) a:map "" a:join
;

Tags:

No tags generated

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: gigasecond

Code

: +gigasecond \ s  s

Tags:

construct:add
construct:gigasecond
construct:keyword
construct:parameter
construct:space
construct:variable
paradigm:imperative
paradigm:functional

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: trinary

Code

: num> \ c -- n
    dup 48 = if drop 0 ;then
    dup 49 = if drop 1 ;then
    dup 50 = if drop 2 ;then
    null 
;

: do_trinary> \ s -- n
    dup "" = if 0 ;then
    0 s:@ num>
    swap s:len 1 - 3 swap n:^ rot n:*
    swap 1 -1 s:slice null? if drop ;then
    recurse n:+ 
;

:  trinary> \ s -- n
     dup ( swap null? if 2drop null else drop num> then ) 0 s:reduce
     null? if 3drop 0 ;then
     drop do_trinary>
;

Tags:

construct:assignment
construct:boolean
construct:char
construct:comment
construct:define
construct:drop
construct:if-else
construct:if-elseif
construct:invocation
construct:lambda
construct:local
construct:method
construct:multiply
construct:null
construct:nullability
construct:number
construct:parameter
construct:recursive-call
construct:subtract
construct:swap
construct:then
construct:variable
paradigm:functional
paradigm:imperative
paradigm:object-oriented
technique:higher-order-functions
technique:recursion

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: darts

Code

: hypot \ x y -- sqrt(x^2 + y^2)
  n:sqr swap n:sqr n:+ n:sqrt ;
: n:<= \ x y -- T
  2dup n:= -rot n:< or ;
: darts-score \ n n -- n
  hypot
  dup  1 n:<= if drop 10 ;then
  dup  5 n:<= if drop  5 ;then
  dup 10 n:<= if drop  1 ;then
  drop 0
;

Tags:

construct:add
construct:boolean
construct:drop
construct:floating-point-number
construct:if-then-else
construct:invocation
construct:lambda
construct:number
construct:or
construct:parameter
construct:stack
construct:swap
construct:word
paradigm:functional
paradigm:stack-oriented
technique:boolean-logic
uses:add

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: darts

Code

with: n
: darts-score \ n n -- n
   sqr swap sqr + \ Sum of x² + y²
   [10, 5, 1, 0] \ scores for inner, middle, outer, and beyond rings
   [1, 5, 10] ' sqr a:map \ radius² of inner, middle, outer rings
   rot
   ( > if 1 else -1 then ) \ less-than-or-equal
   a:pigeon \ Score of ring where x² + y² <= radius²
;
;with

Tags:

construct:add
construct:annotation
construct:backslash
construct:char
construct:comment
construct:define
construct:if-else
construct:invocation
construct:lambda
construct:list
construct:map
construct:number
construct:parameter
construct:tuple
construct:variable
construct:visibility-modifiers
paradigm:functional
paradigm:object-oriented
technique:higher-order-functions

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: atbash-cipher

Code

: encode-char \ c -- c
  dup 'a 'z n:between if
    'z swap n:- 'a n:+ 
  then
  ;

\ decode FROM a cipher
: atbash> \ s -- s
  /[^[:alnum:]]/ "" s:replace! s:lc
  ' encode-char s:map
  ;

\ encode TO a cipher
: >atbash \ s -- s
  atbash>
  [5] s:/ " " a:join
  ;

Tags:

construct:char
construct:comment
construct:define
construct:invocation
construct:lambda
construct:method
construct:parameter
construct:pattern
construct:quotations
construct:stack
construct:word
paradigm:functional
paradigm:reflective
technique:higher-order-functions

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: resistor-color

Code

: colors \ -- a
  ["black", "brown", "red", "orange", "yellow", "green", "blue", "violet", "grey", "white"] ;
: color-code \ s -- n
  colors swap ' s:= a:indexof nip ;

Tags:

construct:assignment
construct:char
construct:comment
construct:invocation
construct:lambda
construct:list
construct:parameter
construct:quotations
construct:stack
construct:swap
construct:word
paradigm:functional
paradigm:imperative
paradigm:reflective
technique:higher-order-functions

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: etl

Code

: keys-to-values \ m key value -- m
    swap >n >r
    ( s:lc r@ m:! ) a:each!
    drop rdrop
;

: transform \ m -- m
    m:new swap
    ' keys-to-values m:each
    drop
;

Tags:

construct:comment
construct:definition
construct:drop
construct:invocation
construct:keyword
construct:parameter
construct:quotations
construct:stack
construct:swap
construct:word
paradigm:functional
paradigm:reflective
technique:higher-order-functions

@iHiD
Member Author

iHiD commented Oct 31, 2023

Exercise: pop-count

Code

: eggCount  \ n -- n
  0 >r             \ keep the count on the r-stack
  repeat
    dup 0 n:= if   \ stop when n = 0
      break
    else
      dup 1 band   \ isolate zeroth bit 
      r> n:+ >r    \ add to count
      1 shr        \ and shift the input number
    then
  again
  drop r>          \ drop the number and pull the count from r-stack
;

Tags:

construct:assignment
construct:band
construct:break
construct:comment
construct:drop
construct:if
construct:invocation
construct:method
construct:number
construct:parameter
construct:repeat
construct:stack
construct:then
construct:word
paradigm:stack-based

@axtens
Member

axtens commented Nov 1, 2023

Everything after "\ " is a comment. So the guesses are a bit wide of the mark sometimes

@iHiD
Member Author

iHiD commented Nov 1, 2023

> Everything after "\ " is a comment. So the guesses are a bit wide of the mark sometimes

Yes, I'd expect lower-frequency languages like 8th to be less accurate, but once the tags are updated and we retrain it, it should learn :)
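
As a rough illustration of the comment rule quoted above, here's a minimal Python sketch that strips such comments from a solution before it is tagged. It's only a sketch under stated assumptions: `strip_line_comments` is a hypothetical helper, not part of any Exercism tooling, and it assumes a backslash standing alone as a word starts a comment that runs to the end of the line. A lone backslash inside a string literal would be cut too.

```python
import re

# Assumption: a backslash with whitespace (or a line boundary) on both sides
# starts a comment that runs to the end of the line.
_COMMENT = re.compile(r"(?:^|(?<=\s))\\(?:\s.*)?$")


def strip_line_comments(source: str) -> str:
    """Drop end-of-line comments and trailing whitespace from every line."""
    cleaned = []
    for line in source.splitlines():
        # Remove the comment (if any), then tidy the whitespace left behind.
        cleaned.append(_COMMENT.sub("", line).rstrip())
    return "\n".join(cleaned)
```

With this, a line such as `: num> \ c -- n` is reduced to `: num>`, so the stack-effect text no longer looks like code.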

@ErikSchierboom
Member

This is an automated comment

Hello 👋 Next week we're going to start using the tagging work people are doing on these. If you've already completed the work, thank you! If you've not, but intend to this week, that's great! If you're not going to get round to doing it, and you've not yet posted a comment letting us know, could you please do so, so that we can find other people to do it. Thanks!

@glennj
Contributor

glennj commented Nov 14, 2023

@axtens I will not be able to help out with this.

@axtens
Member

axtens commented Nov 15, 2023 via email

@axtens
Member

axtens commented Nov 15, 2023 via email

@ErikSchierboom
Member

@axtens Do you know when you'll have time?

@axtens
Member

axtens commented Nov 21, 2023 via email

@axtens
Member

axtens commented Nov 21, 2023

@iHiD @ErikSchierboom you seem to have mis-scanned some. There is a full-blown version in the .meta folder but you seem to have taken some of the examples from the root of the task rather than its .meta. +gigasecond for example.

@ErikSchierboom
Member

> you seem to have mis-scanned some.

Yes, sorry about that!

@ErikSchierboom
Member

@axtens Did you get a chance to work on this? If not, that's fine too, and we can do it later.

@axtens
Member

axtens commented Nov 23, 2023 via email

@axtens
Member

axtens commented Nov 25, 2023

--hello-world
paradigm:imperative
paradigm:procedural
paradigm:concatenative
construct:colon-definition
construct:string
uses:informal SED
--two-fer
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
construct:colon-definition
construct:if
construct:boolean
construct:null
construct:string
uses:informal SED
uses:string-format
--bob
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
technique:regular-expression
construct:logical-and
construct:logical-not
construct:colon-definition
construct:if
construct:boolean
construct:number
construct:string
construct:int
construct:array
uses:informal SED
--leap
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
construct:colon-definition
construct:if
construct:divide
construct:boolean
construct:number
construct:int
uses:informal SED
--triangle
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
construct:colon-definition
construct:if
construct:add
construct:subtract
construct:boolean
construct:number
construct:int
uses:informal SED
uses:r-stack
--collatz-conjecture
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:bit-manipulation
technique:recursion
construct:colon-definition
construct:if
construct:add
construct:bitwise-and
construct:multiply
construct:subtract
construct:boolean
construct:null
construct:number
construct:int
uses:informal SED
--armstrong-numbers
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
technique:higher-order-functions
technique:type-conversion
construct:explicit-conversion
construct:lambda
construct:variable
construct:colon-definition
construct:add
construct:exponentiation
construct:boolean
construct:null
construct:number
construct:string
construct:int
construct:array
uses:informal SED
--isogram
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
technique:higher-order-functions
technique:regular-expression
technique:sorting
construct:explicit-conversion
construct:lambda
construct:colon-definition
construct:boolean
construct:string
construct:array
uses:informal SED
--acronym
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:higher-order-functions
technique:regular-expression
construct:explicit-conversion
construct:lambda
construct:colon-definition
construct:string
construct:array
uses:informal SED
--gigasecond
paradigm:imperative
paradigm:procedural
paradigm:concatenative
construct:explicit-conversion
construct:colon-definition
construct:add
construct:date-time
uses:informal SED
--trinary
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:higher-order-functions
construct:colon-definition
construct:break
construct:if
construct:add
construct:multiply
construct:subtract
construct:exponentiation
construct:boolean
construct:number
construct:string
construct:int
uses:informal SED
uses:r-stack
--darts
paradigm:imperative
paradigm:procedural
paradigm:concatenative
construct:colon-definition
construct:if
construct:add
construct:exponentiation
construct:boolean
construct:number
construct:int
construct:float
uses:informal SED
--series
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
technique:higher-order-functions
construct:colon-definition
construct:if
construct:number
construct:string
construct:int
construct:array
uses:r-stack
--atbash-cipher
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
technique:higher-order-functions
technique:regular-expression
construct:explicit-conversion
construct:colon-definition
construct:if
construct:subtract
construct:boolean
construct:number
construct:string
construct:int
uses:informal SED
--resistor-color
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:boolean-logic
construct:indexer
construct:colon-definition
construct:number
construct:string
construct:int
construct:array
uses:informal SED
--etl
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:higher-order-functions
construct:explicit-conversion
construct:lambda
construct:colon-definition
construct:string
construct:array
construct:dictionary
uses:informal SED
--pop-count
paradigm:imperative
paradigm:procedural
paradigm:concatenative
technique:bit-manipulation
technique:bit-shifting
technique:boolean-logic
construct:colon-definition
construct:repeat-until-loop
construct:add
construct:bitwise-right-shift
construct:number
construct:int
uses:informal SED
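
For anyone wanting to load the listing above into a script, here's a minimal Python sketch. It assumes the layout used in that listing: a line starting with `--` names an exercise, and every following line up to the next `--` line is one tag for that exercise. `parse_tag_listing` is a hypothetical helper name, not something from the Exercism tooling.

```python
def parse_tag_listing(text: str) -> dict[str, list[str]]:
    """Group tag lines under the `--exercise` header that precedes them."""
    tags_by_exercise: dict[str, list[str]] = {}
    current = None  # slug of the exercise whose tags are being collected
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("--"):
            # A header line such as "--hello-world" starts a new exercise.
            current = line[2:].strip()
            tags_by_exercise.setdefault(current, [])
        elif current is not None:
            # Tags are kept verbatim, including ones with spaces
            # such as "uses:informal SED".
            tags_by_exercise[current].append(line)
    return tags_by_exercise
```

Tags are stored exactly as written, so any normalisation (for example mapping `uses:informal SED` to `uses:informal-sed`) is left to whoever consumes the result.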

exercism deleted 8 comments from iHiD on Nov 28, 2023