Jacob-Lord/RustSentimentScoring

Jacob Lord

03/26/2025

Assignment

Social Sentiment Scoring - Rust Program

Q1 - 5 Short Answer Questions

1. How does the new language handle variable memory allocation, and what impact does this have on performance and memory safety?

Rust allocates memory for local variables on the stack by default; heap allocation happens only when a type that owns heap memory, such as Box, Vec, or String, is used. Stack allocation is very fast compared with heap allocation because it only requires adjusting the stack pointer. Memory safety is enforced at compile time by the ownership and borrowing system: every value has exactly one owner, and ownership can be transferred (moved) or temporarily lent out (borrowed) as necessary. When the owner goes out of scope, the value is dropped automatically without any programmer intervention, which prevents memory-related errors such as dangling references and double frees and makes memory leaks much less likely. Unlike the automatic garbage collection seen in Java, the ownership model needs no separate collector running at runtime, so it generally has lower runtime overhead. The trade-off is that the programmer is forced to write code that satisfies these rules, otherwise the source code will not compile.
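As a rough illustration of the points above, the sketch below puts one value on the stack, one on the heap, and moves ownership of a String. The function name `describe` and all variable names are made up for this example.

```rust
// Hypothetical helper: shows stack values, heap values, moves, and drops.
fn describe() -> String {
    let count: i32 = 42; // plain integer, stored on the stack
    let boxed: Box<i32> = Box::new(count); // the integer inside the Box lives on the heap
    let name = String::from("rust"); // the String's buffer lives on the heap
    let moved = name; // ownership moves from `name` to `moved`
    // println!("{}", name); // would not compile: `name` was moved

    {
        let temp = String::from("scoped");
        println!("temp has {} bytes", temp.len());
    } // `temp` is dropped here and its heap buffer is freed automatically

    format!("{} {} {}", count, boxed, moved)
} // `boxed` and `moved` are dropped here

fn main() {
    println!("{}", describe());
}
```

No explicit free call appears anywhere; the drops happen at the closing braces.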

2. Describe the language’s approach to type systems and type binding time. Is the language statically or dynamically typed, and how does that affect error detection and program flexibility?

Rust is strongly and statically typed: the type of every variable is bound at compile time, either from an explicit annotation or through type inference. Because types are bound during compilation, type errors are caught by the Rust compiler (rustc), and the build fails until the programmer corrects them. Static typing trades some flexibility for reliability, since a value is never implicitly converted into an unrelated type; numeric conversions, for example, must be written explicitly. However, Rust does allow shadowing: reusing the "let" keyword with an existing name creates a new variable that may have a different type, which is convenient for holding the result of a conversion under the same name. When a method could produce many possible types, as with parse, the target type must be stated explicitly. Shadowing gives back some flexibility even though the language is statically typed.
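A minimal sketch of shadowing and explicit conversion, assuming a hypothetical helper named `parse_score` that is not part of the assignment code:

```rust
// Hypothetical helper: turns text such as " 12 " into an integer.
fn parse_score(raw: &str) -> i32 {
    let score = raw.trim(); // `score` is a &str here
    // Shadowing: the same name now refers to an i32.
    // `parse` can produce many types, so the annotation is required.
    let score: i32 = score.parse().expect("not an integer");
    score
}

fn main() {
    // Numeric conversions are never implicit; `as` makes them explicit.
    let widened = 7_i32 as f64;
    println!("{} {}", parse_score(" 12 "), widened);
}
```

Removing the `: i32` annotation makes the compiler reject the program, because it cannot tell which type `parse` should produce.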

3. What are the subprogram calling conventions in the language? How are parameters passed (by value, by reference, etc.), and how does the return mechanism work?

A function definition in Rust begins with ```fn```, then whitespace followed by the user-defined function name. The conventional naming style for Rust functions is snake case. The function name is followed by a pair of parentheses that contain the function parameters, and the body of the function is enclosed in curly braces, resulting in a function format resembling:

fn test_fxn() { /*execution body*/ }

Each function parameter is written as the parameter name followed by a colon and then the data type of the parameter. Declaring the data type of each parameter is a requirement of the language. For example, if the parameter is a character, then the function would be of the form:

fn test_fxn(x: char) { /* execution body */ }

For more than one parameter, a comma separates each individual parameter. For example:

fn test_fxn(x: char, y: i32) { /* execution body */ }

Rust function parameters are passed by value by default. For most types this means ownership of the value moves into the function, so the original variable can no longer be used after the call, and the value is dropped when the function ends unless it is returned and stored in a different variable. For data types that implement the Copy trait, such as integers and characters, a copy of the value is given to the function, and therefore the original variable that was passed as an argument is still usable after the function call finishes. The language also allows passing arguments by reference using the "&" symbol (or "&mut" for a mutable reference), so instead of giving the function ownership of the value, the function is given access to it and the value is not dropped once the function call finishes. Passing by reference in Rust is known as borrowing because ownership of the value is not given away, only temporary access to the memory it occupies.
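The three cases described above (move, copy, and borrow) can be sketched as follows. The function names `consume`, `borrow`, and `double` are illustrative only.

```rust
// Takes ownership: the String is dropped when this function returns.
fn consume(text: String) -> usize {
    text.len()
}

// Borrows: the caller keeps ownership of the underlying string.
fn borrow(text: &str) -> usize {
    text.len()
}

// i32 implements Copy, so the caller's variable stays usable.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let n = 5;
    let d = double(n); // `n` is copied, not moved

    let s = String::from("hello");
    let b = borrow(&s); // `s` is only borrowed here
    let c = consume(s); // `s` is moved here
    // println!("{}", s); // would not compile: `s` was moved into `consume`

    println!("{} {} {} {}", n, d, b, c);
}
```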

When defining the function signature, the return value's data type is declared on the same line, after the parameter list, by pointing to it with the "->" symbol. A function that returns nothing simply omits this part. An example with a return type would be:

fn test_fxn(x: char, y: i32) -> char { /* execution body returning a char */ }

The return type for the example function is a character, and this is how the compiler knows whether the value a function returns is valid. A function returns a value either with an explicit return statement or by ending its body with an expression that has no trailing semicolon. The return mechanism works by moving ownership of the returned value back to the caller, for example into the variable on the left side of the assignment that contains the function call. If the result of the call is not stored anywhere, the returned value is dropped at the end of that statement and cannot be used afterwards.
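A small sketch of returning ownership to the caller, using made-up functions `shout` and `first_char`:

```rust
// Takes ownership of a String, modifies it, and hands ownership back.
fn shout(mut text: String) -> String {
    text.push('!');
    text // tail expression without a semicolon: this is the return value
}

// Uses an explicit `return`; a space is returned for empty input.
fn first_char(text: &str) -> char {
    return text.chars().next().unwrap_or(' ');
}

fn main() {
    let greeting = shout(String::from("hi")); // the caller now owns the result
    shout(String::from("ignored")); // result not stored, dropped at the end of this statement
    println!("{} {}", greeting, first_char(&greeting));
}
```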

4. How does the language manage variable scope and lifetime—particularly for local, global, and any other static variables, if any?

    The language manages variable scope with curly braces like you would see in C. The lifetime of a local variable is managed via the ownership system, so once the variable goes out of scope its lifetime ends and its value is dropped from memory. If ownership is not passed and the value is simply borrowed into a function's scope, then it outlives the function call and remains usable by the caller. There is no "global" keyword like you would see in other languages such as Python; instead, values that must be visible everywhere are declared at module level with the "static" or "const" keywords. Static variables live for the entire duration of the program.
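    The sketch below shows a block scope, a borrow that ends when the call returns, and module-level items. The names `DEFAULT_FILE`, `MAX_STARS`, and `longest_len` are assumptions for illustration and are not taken from the program itself.

```rust
// Module-level items: visible to every function in this file.
static DEFAULT_FILE: &str = "review.txt"; // lives for the whole program
const MAX_STARS: u32 = 5; // compile-time constant

// Borrows the slice; the caller still owns the words afterwards.
fn longest_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).max().unwrap_or(0)
}

fn main() {
    let words = vec![String::from("good"), String::from("terrible")];
    {
        let inner = longest_len(&words); // the borrow ends when the call returns
        println!("longest word: {}", inner);
    } // `inner` goes out of scope here
    // println!("{}", inner); // would not compile: `inner` is out of scope

    println!("{} {} {}", words.len(), DEFAULT_FILE, MAX_STARS);
}
```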

5. Identify one unique or innovative feature of the language and explain how it affects the way programmers write and structure code.

    The most distinctive feature of the Rust programming language is its ownership and borrowing system. With languages such as C, heap memory management is left as the programmer's responsibility; in Rust, however, the ownership and borrowing rules handle memory management and force the programmer to write safe code or else the source code will not compile. Because manual allocation and deallocation are no longer needed, the code is less cluttered and easier to read, and programmers tend to structure their code around which part of the program owns each value and which parts only borrow it.

Short Report on the Program

    The Rust programming language is a systems programming language whose focus is ensuring memory safety by forcing the programmer to write safe code through its ownership model and borrowing system. The ownership system drops values as soon as their owner goes out of scope, which generally costs less at runtime than running a separate garbage collector like you would see in Java. However, with the ownership and borrowing system, it is incredibly important to understand the scope and lifetime of variables, as the compiler will not build the program if a variable is used outside of its scope or lifetime. In this way, the language is very reliable and memory safe when used as intended. The use of pointers in other languages can lead to pointer-related issues such as dangling pointers and memory corruption; safe Rust instead relies on references, and the borrowing rules only allow them to be used correctly. Raw pointers are also a feature of the language, but dereferencing one requires the programmer to mark the code with the unsafe keyword. Through its meticulous compiler checks, the language does a great job of forcing the programmer to follow the rules and create safe, reliable programs.
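    To make the distinction between references and raw pointers concrete, here is a minimal sketch. The function `read_through_raw` is hypothetical and exists only to show where the unsafe keyword is required.

```rust
// Hypothetical helper: reads an integer through a raw pointer.
fn read_through_raw(value: &i32) -> i32 {
    let ptr: *const i32 = value; // creating a raw pointer is allowed in safe code
    // Dereferencing it is not, so the read must sit inside an `unsafe` block.
    // It is sound here only because `ptr` came from a valid reference.
    unsafe { *ptr }
}

fn main() {
    let number = 7;
    let reference = &number; // an ordinary reference, checked by the borrow checker
    println!("{} {}", *reference, read_through_raw(&number));
}
```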
    My first step in approaching the problem, besides reading and comprehending the assignment, was to spend a couple of hours reading through the Rust documentation and gaining a stronger understanding of the features provided by the language, especially the ownership and borrowing system, as those were the most distinctive aspects of Rust. Once I had a clear understanding of how variables worked in the language, I researched the file handling mechanisms more deeply, as that would be incredibly important for this specific problem. After some research I learned that there was a crate (a Rust package) specifically designed to handle .csv files.
    The documentation for this crate left me a little confused due to its large number of features and specifications, but I found a great tutorial video that demonstrated the differences between objects such as Reader and ReaderBuilder and showed how to use the features of the crate at a high level in practice. (https://www.youtube.com/watch?app=desktop&v=9w3rVqsdLRE) I did not finish the entire video, but after gaining a decent understanding of how the crate worked, I was able to implement a Reader object to successfully extract the .csv data into my program.
    Now I had to implement a dictionary to store the key-value pairs obtained from the .csv file, so I set out to research dictionaries in Rust. The standard library provides the HashMap type (in std::collections) for this purpose, so once I understood how HashMap worked I was able to create a function that builds the sentiment table.
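    As a rough sketch of the idea, the function below builds a HashMap from CSV text. It does not use the csv crate described above; to stay dependency-free it splits each line by hand with the standard library. The name `build_table`, the `word,score` column layout, and the f64 score type are assumptions.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds a word -> score table from "word,score" lines.
// Lines without a comma or with a non-numeric score are skipped.
fn build_table(csv_text: &str) -> HashMap<String, f64> {
    let mut table = HashMap::new();
    for line in csv_text.lines() {
        if let Some((word, score)) = line.split_once(',') {
            if let Ok(score) = score.trim().parse::<f64>() {
                table.insert(word.trim().to_string(), score);
            }
        }
    }
    table
}

fn main() {
    let table = build_table("good,1.5\nbad,-2\n");
    println!("{:?}", table.get("good"));
}
```

    A real CSV file can contain quoted fields with embedded commas, which this hand-rolled split does not handle; that is one reason to prefer a dedicated crate.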
    With the table created, I went back to the file handling section of the Rust documentation to understand how to open the file and iterate through its content to analyze the overall sentiment. My first thought was that this would be the easiest part, but I realized my mistake when I printed out each word and some of them contained punctuation marks. Using my understanding of regular expressions, I brought in the Regex crate and defined a pattern to capture only words, which seemed successful at first. Then I noticed that contractions were being split into separate words, which produced an incorrect overall sentiment score. Once I realized my mistake, I altered the pattern to match the different forms of whitespace and split the words at those points. This was the correct choice, as now all the proper words were being assigned sentiment values. With that, I had a finished function to determine the sentiment score of a .txt file.
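    The whitespace-splitting approach can be sketched as follows. Note the swap: this example uses the standard library's `split_whitespace` instead of the Regex crate mentioned above, and the name `score_text` and the f64 scores are assumptions.

```rust
use std::collections::HashMap;

// Hypothetical helper: sums the scores of every whitespace-separated word
// that appears in the table. Contractions such as "don't" stay in one piece.
fn score_text(text: &str, table: &HashMap<String, f64>) -> f64 {
    text.split_whitespace()
        .filter_map(|word| table.get(word).copied())
        .sum()
}

fn main() {
    let mut table = HashMap::new();
    table.insert(String::from("good"), 2.0);
    table.insert(String::from("don't"), -1.0);

    // Spaces, tabs, and newlines all count as separators.
    println!("{}", score_text("good  don't\tgood\n", &table));
}
```

    Words that are not in the table contribute nothing to the total.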
    After a quick search I figured out how to handle command line arguments in Rust, so I implemented a two-way branching statement that either takes the file as a command line argument or defaults to the review.txt file if no arguments are provided, as specified in the assignment overview.
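    A minimal sketch of that branch, assuming a hypothetical helper named `choose_path`. The default file name review.txt comes from the assignment.

```rust
use std::env;

// Hypothetical helper: picks the first real argument, or the default file.
// args[0] is the program name, so the user's argument is at index 1.
fn choose_path(args: &[String]) -> &str {
    if args.len() > 1 {
        args[1].as_str()
    } else {
        "review.txt"
    }
}

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("scoring file: {}", choose_path(&args));
}
```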
    The last step I took was creating a function to identify the star rating of the text based on its accumulated sentiment score. This was the easiest part, as I only had to research how switch statements work in the language. Rust has a different syntax for switch statements that I hadn't seen before, and the construct is called match instead of switch. No break statements are required, and the default case is identified by an underscore, which was a little different. After trial and error, I also learned that match is an expression that produces a value, so I had to assign its result to a variable in order to use it. Once the syntax was figured out, I was easily able to obtain a star value based on the accumulated score and print out the results.
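    For illustration, a match-based rating function might look like the sketch below. The function name `star_rating` and every threshold in it are invented for this example and are not the cutoffs used in the assignment.

```rust
// Hypothetical helper: maps a sentiment score to a 1-5 star rating.
// The cutoff values are placeholders, not the assignment's real ones.
fn star_rating(score: f64) -> u8 {
    // `match` is an expression, so its result can be bound to a variable.
    let stars = match score {
        s if s < -5.0 => 1,
        s if s < -1.0 => 2,
        s if s < 1.0 => 3,
        s if s < 5.0 => 4,
        _ => 5, // default case, no `break` needed anywhere
    };
    stars
}

fn main() {
    println!("{} stars", star_rating(2.5));
}
```

    Each arm is tried from top to bottom and only the first matching arm runs.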
    I liked learning about the ownership model and borrowing system since it was a new feature I had never worked with before. It was very nice not having to constantly allocate and free memory, and instead leaving it to the system to drop values when they go out of scope. I did not like the unwrap() method that had to be used with Option and Result values, as it threw me for a loop initially. I found it difficult to work with, but I am sure that with practice it would become easier to use and convenient in the correct context.
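    For context, unwrap() extracts the value from a Result or Option and panics if there is none. The sketch below contrasts it with two alternatives that supply a fallback; the function name `parse_or_zero` is made up for this example.

```rust
// Hypothetical helper: handles both outcomes of `parse` explicitly.
fn parse_or_zero(raw: &str) -> f64 {
    match raw.trim().parse::<f64>() {
        Ok(value) => value,
        Err(_) => 0.0, // fall back instead of panicking
    }
}

fn main() {
    // `unwrap` is fine when failure is impossible or should stop the program.
    let forced: f64 = "2.5".parse().unwrap();

    // `unwrap_or` is a shorter way to supply a default value.
    let fallback = "oops".parse::<f64>().unwrap_or(0.0);

    println!("{} {} {}", forced, fallback, parse_or_zero(" 4 "));
}
```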
    I used ChatGPT to help me figure out certain errors in my code related to ownership and borrowing, and it really helped me gain a greater understanding of the system, as it was able to explain it to me at a simpler level when I needed that. It was very helpful for answering detailed questions about the language when I couldn't properly understand the official documentation.

About

A scoring program written in Rust which utilizes the sentiment scoring table provided by Stanford University in order to rank text files based on the language used.
