- Adopt TypeScript
- Partition script execution
Right now the application processes all selected files at once, asynchronously (the command arguments of the processing script are the whole list of selected file paths). This choice was made to avoid spawning multiple processes, because their overhead could make the total processing slower. However, as the number and size of the files increase, this overhead becomes negligible compared with the cost of processing everything simultaneously. With a big dataset the whole system can apparently freeze, especially when little RAM is available. The load therefore needs to be partitioned into batches that run one after another, split by whichever metric gives the best overall performance (e.g. the number of files or their total size).
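As a rough illustration of partitioning by file size, the sketch below groups files greedily into batches under a byte limit and then runs the batches serially. The names `SelectedFile`, `partitionBySize` and `runSerially`, and the `runBatch` callback that stands in for spawning the processing script, are assumptions made for this example and are not part of the application.

```typescript
// Hypothetical shape of a selected file: its path and its size in bytes.
interface SelectedFile {
  path: string;
  sizeBytes: number;
}

// Greedily groups files, in their given order, into batches whose total size
// does not exceed maxBatchBytes. A single file larger than the limit ends up
// in a batch of its own.
function partitionBySize(
  files: SelectedFile[],
  maxBatchBytes: number,
): SelectedFile[][] {
  const batches: SelectedFile[][] = [];
  let current: SelectedFile[] = [];
  let currentBytes = 0;

  for (const file of files) {
    // Close the current batch if adding this file would exceed the limit.
    if (current.length > 0 && currentBytes + file.sizeBytes > maxBatchBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(file);
    currentBytes += file.sizeBytes;
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}

// Runs the batches one after another. runBatch is a placeholder for whatever
// actually spawns the processing script with the given file paths.
async function runSerially<R>(
  batches: SelectedFile[][],
  runBatch: (paths: string[]) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of batches) {
    // Awaiting inside the loop guarantees only one batch is active at a time.
    results.push(await runBatch(batch.map((file) => file.path)));
  }
  return results;
}
```

Partitioning by the number of files would only change the condition that closes a batch; which metric performs better would have to be measured on real datasets.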
- Add asynchronous display of results
The results are fetched and displayed all together once every script has finished executing (that is, once all script-spawning promises have been resolved). If the workload consists of many scripts, the display is therefore bottlenecked by the slowest one. With partitioned execution (the feature above), asynchronous display becomes vital, as the total duration of execution might be many times longer than that of one batch.
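One possible way to show each result as soon as its script finishes is to attach a handler to every promise individually instead of waiting on all of them together. The sketch below assumes a hypothetical helper `displayAsCompleted` with `onResult` and `onError` callbacks; how the application would actually render a result is left to those callbacks.

```typescript
// Calls onResult (or onError) for each task as soon as that task settles,
// regardless of the order in which the tasks were started. The returned
// promise resolves once every task has been handled.
function displayAsCompleted<T>(
  tasks: Promise<T>[],
  onResult: (result: T, index: number) => void,
  onError: (error: unknown, index: number) => void,
): Promise<void> {
  const handled = tasks.map((task, index) =>
    task.then(
      (result) => onResult(result, index),
      // A failing script is reported on its own and does not block the rest.
      (error) => onError(error, index),
    ),
  );
  return Promise.all(handled).then(() => undefined);
}
```

Because rejections are handled per task, one failing script does not prevent the remaining results from being displayed.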
- Revisit processing scripts' structure, used packages, language and running environment
Currently, the built-in scripts are written in R using the quanteda and koRpus packages. Apart from code optimizations, there are many possible alterations, such as switching to Python and its modules or using a different runtime environment (instead of spawning processes at a command prompt, scripts could be executed in an R or Python console that would run and stay open alongside the app).
- Create project and workspace features
Organizing work into projects and workspaces is very common in applications. The application does not support this at the moment. Instead, every instance works under a single "workspace", which means that added files, stored results and custom scripts are shared between different instances of the application.
- Integrate more indices
The set of built-in indices is sufficient for the current phase of the program, but it is by no means complete. We therefore want to integrate more indices, such as n-grams.
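For readers unfamiliar with the term, an n-gram is a run of n consecutive tokens. The sketch below only illustrates the idea; the function names `ngrams` and `countNgrams` are hypothetical, and an actual built-in index would presumably be implemented in the processing scripts rather than in the application code.

```typescript
// Returns every run of n consecutive tokens, in order of appearance.
function ngrams(tokens: string[], n: number): string[][] {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  const result: string[][] = [];
  for (let start = 0; start + n <= tokens.length; start++) {
    result.push(tokens.slice(start, start + n));
  }
  return result;
}

// Counts how often each n-gram occurs, keyed by its space-joined tokens.
function countNgrams(tokens: string[], n: number): Map<string, number> {
  const counts = new Map<string, number>();
  for (const gram of ngrams(tokens, n)) {
    const key = gram.join(" ");
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```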
- Improve the application’s appearance
A polished appearance is a vital asset in building user-friendly GUIs and, therefore, appealing applications. A visual makeover is therefore considered urgent: adding new elements to the program (such as a loading spinner or a progress bar while scripts execute), offering different color themes and font sizes, and letting the user resize the different elements with sliders.
- Add tooltips
Tooltips are important to every application, because they explain each element to the user and make the application much easier to use. Unfortunately, the application currently lacks them, which makes it somewhat difficult to use.