Plagiarism is a plug-in for the gh-edu ecosystem to detect plagiarism in programming assignments
Plagio es la copia servil o imitación torpe de un modelo, con pretensiones de originalidad.
Plagiarism is the slavish copying or clumsy imitation of a model, with pretensions of originality.
-- Correa y Lázaro
- Highly concurrent: Concurrency and parallelism with no bottlenecks, so you get results as fast as possible
- Graceful degradation: If some process fails, it still tries to give you at least partial results (MOSS URL, report file, graph image)
- Works well at scale: No matter how many assignments or students are in your organization, plagiarism balances speed and memory consumption
- Useful for scripting: It uses fzf to ask for user input, but it is fully functional through CLI flags
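The first two points can be illustrated with a minimal Go sketch of bounded concurrency that keeps partial results. This is not the plugin's actual implementation: `processAll`, `result`, and `fakeClone` are hypothetical names, and `fakeClone` only stands in for a real clone step.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// result holds the outcome of processing one repository.
type result struct {
	repo string
	err  error
}

// fakeClone is a hypothetical stand-in for a real clone step.
// It fails for one repository so the partial-result path is exercised.
func fakeClone(repo string) error {
	if repo == "hw1-bob" {
		return errors.New("clone failed")
	}
	return nil
}

// processAll runs work on every repo with at most limit goroutines doing
// work at the same time. It never aborts early: failures are collected and
// returned next to the successes.
func processAll(repos []string, limit int, work func(string) error) (ok []string, failed []result) {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit)        // counting semaphore
	results := make(chan result, len(repos)) // buffered so senders never block
	var wg sync.WaitGroup

	for _, repo := range repos {
		wg.Add(1)
		go func(repo string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results <- result{repo: repo, err: work(repo)}
		}(repo)
	}
	wg.Wait()
	close(results)

	for r := range results {
		if r.err != nil {
			failed = append(failed, r)
			continue
		}
		ok = append(ok, r.repo)
	}
	return ok, failed
}

func main() {
	repos := []string{"hw1-alice", "hw1-bob", "hw1-carol"}
	ok, failed := processAll(repos, 2, fakeClone)
	fmt.Printf("%d succeeded, %d failed\n", len(ok), len(failed))
}
```

The semaphore caps how many steps run at once, which is one way to trade speed against memory use, and the caller can still continue with the repositories that succeeded.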
Plagiarism depends on the Stanford service MOSS. It clones all the repositories related to an assignment in an organization and sends them to the MOSS service. It then passes the result to a Python script (mossum) that generates a graph.
- MOSS script
- Make sure it can be executed
chmod ug+x moss
- Perl
- Python 3
- mossum script installed
- fzf (optional)
As a gh-edu plugin:
- Install as a gh-edu extension
gh edu install plagiarism
- Move or copy the moss script to the root directory
mv moss ~/.local/share/gh/extensions/gh-edu-plagiarism
- Get the binary
- Download the binary from the releases page
- Clone the repository and compile it (you will need Go 1.18 or later)
- Use go install
go install github.com/gh-cli-for-education/gh-edu-plagiarism@latest
- Move or copy the moss script to the same directory as the binary
- c
- cc (C++)
- java
- ml (Meta Language)
- pascal
- ada
- lisp
- scheme
- haskell
- fortran
- ascii
- vhdl
- perl
- matlab
- python
- mips
- prolog
- spice
- vb (Visual Basic)
- csharp (C#)
- modula2
- a8086 (8086 assembly)
- javascript
- plsql (PL/SQL)
- verilog
It looks like the original creator of MOSS lost the source code and the server
is running from a binary, so it is very unlikely that more languages will be added
https://www.quora.com/Why-is-the-MOSS-measure-of-the-software-similarity-algorithm-not-open-sourced
Extracted from the --help flag:
Usage:
gh edu plagiarism [flags]
Flags:
-a, --anonymize Indicate if you want to randomize the names
-c, --course string Specify the course/organization
-e, --exercise string Specify the regex for the assignment/exercise
-h, --help help for gh
-l, --language string Select the language (see https://github.com/gh-cli-for-education/gh-edu-plagiarism#compatible-languages). You can treat this flag as a boolean or pass a string
-m, --min-lines int Minimum lines to show links (default 1)
-o, --output string Save the results in the specified path
-p, --percentage int Minimum percentage to show links (default 1)
-q, --quiet No INFO in the output only the result
-t, --template Indicate if there is a tutor template
Plagiarism prints a report of all the possible pairs to standard output. Redirect the output if you want to save that information.
gh edu plagiarism -q > report.txt
It also generates a very useful (and temporary) graph that gives you an overview. To keep it, use the --output flag to indicate the path.
Otherwise it stays in your system's temporary directory until the next execution.
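Because every option is available as a flag, the plugin can also be driven from another program. The following Go sketch builds the argument list from the flags documented above and saves the report to a file. The `options` type, `buildArgs`, `run`, and all the values in `main` are hypothetical; the sketch also assumes the plugin is installed as a gh-edu extension, and that `--language` needs the `=` form to receive a value because it can double as a boolean.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
	"strings"
)

// options mirrors a subset of the flags from the help output.
type options struct {
	course     string // -c, --course
	exercise   string // -e, --exercise (regex without look-arounds)
	language   string // -l, --language
	output     string // -o, --output
	percentage int    // -p, --percentage
}

// buildArgs converts options into the arguments passed to the gh binary.
func buildArgs(o options) []string {
	args := []string{"edu", "plagiarism", "--quiet",
		"--course", o.course,
		"--exercise", o.exercise,
	}
	if o.language != "" {
		// Assumption: the = form is the safe way to pass a value to a
		// flag that can also be used as a boolean.
		args = append(args, "--language="+o.language)
	}
	if o.output != "" {
		args = append(args, "--output", o.output)
	}
	if o.percentage > 0 {
		args = append(args, "--percentage", strconv.Itoa(o.percentage))
	}
	return args
}

// run executes gh and writes its standard output (the report) to reportPath.
func run(o options, reportPath string) error {
	report, err := os.Create(reportPath)
	if err != nil {
		return err
	}
	defer report.Close()

	cmd := exec.Command("gh", buildArgs(o)...)
	cmd.Stdout = report // same effect as `> report.txt`
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// All values below are hypothetical placeholders.
	o := options{course: "my-org", exercise: "^hw1-", language: "java", output: "hw1-results", percentage: 30}
	// Dry run: print the command instead of executing it.
	// Call run(o, "report.txt") to actually execute it.
	fmt.Println("gh " + strings.Join(buildArgs(o), " "))
}
```

Keeping `buildArgs` separate from `run` makes the flag handling easy to check without contacting MOSS.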
Go's regexp package follows Google's RE2 syntax, which was designed for security and predictable (linear-time) performance.
Unfortunately, this means that look-arounds are not supported. Keep this in mind when writing the regex for your assignments.
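The limitation is easy to reproduce with Go's standard `regexp` package. The sketch below shows that a negative look-ahead fails to compile, and one common way to get the same effect in Go code with two RE2-compatible patterns. The helper names (`compiles`, `filterNames`) and the repository names are hypothetical. For the `--exercise` flag itself, the pattern has to be written without look-arounds.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// filterNames keeps the names that match include and do not match exclude.
// Two plain patterns replace what a negative look-ahead would express.
func filterNames(names []string, include, exclude string) ([]string, error) {
	inc, err := regexp.Compile(include)
	if err != nil {
		return nil, err
	}
	exc, err := regexp.Compile(exclude)
	if err != nil {
		return nil, err
	}
	var kept []string
	for _, name := range names {
		if inc.MatchString(name) && !exc.MatchString(name) {
			kept = append(kept, name)
		}
	}
	return kept, nil
}

func main() {
	fmt.Println(compiles(`^hw1-(?!template)`)) // false: look-ahead is rejected
	fmt.Println(compiles(`^hw1-`))             // true

	names := []string{"hw1-alice", "hw1-template", "hw2-bob"}
	kept, err := filterNames(names, `^hw1-`, `^hw1-template$`)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}
	fmt.Println(kept) // [hw1-alice]
}
```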
Links: