Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite Mainframer in Rust. #191

Merged
merged 20 commits into from
Mar 1, 2018
Merged

Rewrite Mainframer in Rust. #191

merged 20 commits into from
Mar 1, 2018

Conversation

artem-zinnatullin
Copy link
Contributor

@artem-zinnatullin artem-zinnatullin commented Feb 25, 2018

Sooo, yeah lol.

General reasons to abandon Bash:

  • Upcoming "Continuous sync" feature Continuous sync from remote to local machine during remote command execution. #188 that should significantly speed up most of workflows needs proper multithreading which is hard to do with Bash
  • Bash makes simple things hard, like substrings, collections, etc
  • Bash lacks simple programming apis which force us to use other tools like grep which work differently on GNU (Linux) and BSD (macOS) and drive us crazy
  • Right from start we knew this day will come, it was just a matter of time, Bash served us well till this day though

Ok, but why Rust:

  • Rust is a cross-platform system language
  • Rust is stable (1.x), sorry Kotlin Native
  • Rust doesn't have GC - less runtime overhead, sorry Go
  • Rust in general is super fast and follows "Zero-cost abstractions" idea, however Mainframer bottlenecks on network and remote command execution
  • Rust is memory safe (with pretty safe multithreading as a side-effect)
  • Rust is my personal hipster choice for 2018, sorry Kotlin (Native too) you're not a hipster choice anymore after 2017 Google announcement
  • Rust has very good set of design decisions and a compiler that tries very hard to catch errors before runtime and explain them so you don't have to google
  • Rust has very well written documentation

Interesting side-effects:

  • We now have not only Integration tests (which Rust version passes with breaze), but also Unit tests! yay
  • We can now add much more complicated features like Continuous sync from remote to local machine during remote command execution. #188 mentioned above, parameters parsing, file parsing and whatever else we might need in future
  • Mainframer is now 2x faster than version written in Bash on a simple benchmark mainframer echo yo with no files to sync and localhost as remote machine. In concrete numbers: 0.02 - 0.03 seconds instead of 0.05 - 0.06 seconds and 6% CPU use instead of 9% which is just nice to have

How to catch-up with this PR and reason about Rust code:

Edit: grammar, add a note about IDE

@artem-zinnatullin
Copy link
Contributor Author

This PR also fixes #190 log limit on Travis as I was running tests too often and it just annoyed the hell out of me.

@@ -0,0 +1,6 @@
[package]
name = "mainframer"
version = "2.1.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future I'd like to detach version from source code and get it from current git tag, but we didn't do it with Bash so for simplicity I didn't do it in this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to open an issue about it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeeep, here you go #192

eval $ANDROID_SDK_INSTALL_COMPONENT '"platform-tools"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"build-tools;25.0.2"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"platforms;android-25"' > /dev/null && \
chmod -R 777 "$ANDROID_HOME"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it here, previously we gave ownership of this folder to the build user in entrypoint script which was slowing down iterative testing as it was running every time on container startup

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chmod a+rwx, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

777, ruly?

@nostra13 sing it to @ming13:

Chmod777:

Podnyal chmoda, stali drugimi prava
Stali schitatsya so mnoy
Znayut kto teper chown

Podnyal chmoda, filesystem v liniu dala
U tebya talant bratan
Kkakoy?
Nastraivat linuxa

https://www.youtube.com/watch?v=qP303vxTLS8


Ignore {
common_ignore_file: match common_ignore_file.exists() {
true => Some(common_ignore_file.to_path_buf()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do pattern-matching for booleans in all languages :trollface:

src/args.rs Outdated
command: {
let str: String = raw_args
.iter()
.cloned()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you own raw_args you can avoid cloning by calling .into_iter(). This will consume raw_args and turn it into an iterator that yields owned Strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! didn't know about this option, thx!

src/args.rs Outdated
.iter()
.cloned()
// Add space between parameters.
.map(|arg| format!("{} ", arg))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, this isn't important, but this code will allocate a new String for result. This is sorta wasteful given that you already have a owned String that you can modify here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I would thought that Compiler will be able to optimize it here

src/args.rs Outdated
.cloned()
// Add space between parameters.
.map(|arg| format!("{} ", arg))
.collect();
Copy link

@pepyakin pepyakin Feb 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but better, I guess you are looking for something like Kotlin's joinToString. Rust also have similar method.
It is unfortunate this function is not so discoverable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

holy moly, this is now legit oneliner, THX!

src/time.rs Outdated
@@ -0,0 +1,141 @@
use std::time::Duration;

pub fn format_duration(duration: &Duration) -> String {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, you don't actually want to pass Duration by a value since it is a Copy type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please discuss that more? :) I think I'm missing something

Docs: https://doc.rust-lang.org/std/marker/trait.Copy.html

So as I see it - I can avoid copying if I don't need it because compiler allows me to do it safely, here with duration: &Duration I'm passing value by reference which only allocates additional reference that points to same struct

If I change function to duration: Duration, whenever I'd call it - a full copy of Duration struct will be allocated as well as new reference to it right?

If so, why should I copy if I don't really need a copy here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yeah, you're right.

Maybe it's just me, my rule of thumb: if I have a Copy type then I just pass it by value since it in many cases is more convenient.

If I change function to duration: Duration, whenever I'd call it - a full copy of Duration struct will be allocated as well as new reference to it right?

Actually it's very compilcated thing : ) But in the end, I think that there will be no difference in the generated machine code, since LLVM very likely will promote this reference to a value and pass it by a copy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(As we say in russian: "Я парень простой, вижу Copy тип — передаю его по значению")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahah

Soo, indeed LLVM seems to be able to elide copy, however it's not guaranteed and there are cases where it doesn't do it

If you don't mind I'd keep it as &Duration for now, I'll need more Rust experience to get a full sense of when I want implicit copy or not :)

That was super helpful discussion though, really appreciate it :)

Useful links for myself:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I meant the other way around: I bet that LLVM should convert references to a copies, because this usually opens new opportunities for optimizations (IIRC, this is called argument promotion).

E.g in this code LLVM will promote the reference to the values, two values to be exact: secs and nanos. And since you don't use nanos in this code it will be elided altogether.

But yeah, LLVM also can elide big copies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wuuut… it indeed can https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/IPO/ArgumentPromotion.cpp

okay, lemme switch to copy then lol

src/time.rs Outdated
let mut text = String::new();

if hours > 0 {
text.push_str(&format!("{value} {label}", value = hours, label = hours_label));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you import std::fmt::Write then you will be able to use a write! macro for formatting strings directly into a text without an extra String being allocated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doooope, thx!

src/sync.rs Outdated
fn apply_exclude_from(rsync_command: &mut Command, exclude_file: &Option<PathBuf>) {
match exclude_file.clone() {
Some(value) => {
rsync_command.arg(format!("--exclude-from={}", value.to_string_lossy()));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use something like?

match exclude_file {
    Some(ref value) => {
         // ...
         value.to_string_lossy()
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wuuuuut, noice

impl Args {
pub fn parse(raw_args: Vec<String>) -> Result<Args, String> {
match raw_args.len() {
0 => Err(String::from("Please pass remote command.")), // TODO more user friendly message, for now it's consistent with Bash version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, String::from(x) is equivalent to x.into().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see the pattern here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still like the String::from though, what do you think is more idiomatic?

String docs use from() heavily, that's why I used it https://doc.rust-lang.org/book/second-edition/ch08-02-strings.html

But into() seems to be a common function across different types as well, maybe I should use from() for "static" values and into() for dynamic?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of any convention here. From and Into are duals to each other in some sense.
You are free to use either String::from(x), x.to_string(), x.to_owned() and x.into() and they will do the same thing, but I have an impression that x.into() is more common.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn Rust, String is most confusing type so far, so many options jeez (and I haven't really used Cow yet btw)

Let's stick to from() for now since docs use it? I'd probably get a sense of convention as I read/write more code, googling didn't help much, I might also ask this later on /r/rust sub to get a sense of what community thinks about it :)

@artem-zinnatullin
Copy link
Contributor Author

@pepyakin big thanks for review! Really appreciate it, will address your comments

I've definitely haven't gotten all quirks of reference and standard type system yet. I believe I can make function parameters more generic in terms of ownership on the data passed to a function and definitely need to learn more ways of using standard functions and types

Will be happy for another round of review from you once I fix already posted feedback!

src/time.rs Outdated

#[test]
fn format_duration_0_seconds() {
assert_eq!(format_duration(&Duration::from_secs(0)), String::from("0 seconds"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you can move the assertion to a function named assert_format which will have seconds and expectedText as arguments.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, assert_eq! takes a str as the second argument. So you can use just "0 seconds".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed!

src/time.rs Outdated
use std::time::Duration;

pub fn format_duration(duration: &Duration) -> String {
let raw_seconds = duration.as_secs();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total_seconds

pushd "$DIR/../" > /dev/null

echo "Building debug version of Mainframer..."
cargo build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a good idea to include some additional debug flags, like -W or something? Maybe even use Clippy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep for sure, let's do it in separate PR? I'd like to fail on warnings (currently 0) and enforce more checks like Clippy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #193 to track that

src/time.rs Outdated
text.push(' ');
}

text.push_str(&format!("{value} {label}", value = seconds, label = seconds_label));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting {value} {labels} asks for moving to a separate function.

Instead of doing String concat yourself you’ll be better off using join. In Kotlin it will be listOf("hours", "minutes").joinToString(separator = " ").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, will move

src/time.rs Outdated
}

#[cfg(test)]
mod tests {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a common practice in Rust to have tests in the same file as the main source set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like so, that's how documentation suggests it and how other projects like ripgrep do it

I've tried to move tests into test folder, but looks like it's suitable only for declaring common test code but not tests themselves

@pepyakin am I right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it is very common way to structure test code. And at times it is the only possible one, since submodules can access it's parent module items.

use std::process::Command;

// TODO add internal version of sync functions with closures as parameters to unit test properly.
pub fn sync_local_to_remote(local_project_dir_name: &str, config: &Config, ignore: &Ignore) -> Result<(), String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sync_remote_to_local and sync_local_to_remote seem awfully similar. Most likely they can be unified somehow. At least the result part is the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result part is for sure pretty much same, will refactor,
however I wouldn't combine sync logic as they use different ignore files and specify few rsync parameters completely different

.arg(format!("--rsync-path=mkdir -p {} && rsync", project_dir_on_remote_machine(local_project_dir_name)))
.arg(format!("--compress-level={}", config.local_compression_level));

apply_exclude_from(&mut command, &ignore.common_ignore_file);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t like the apply_exclude_from function as a side effect here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally understand your frustration here as I'm also used to defensive copy everywhere especially with Kotlin

However, this is kinda whole point of Rust, it makes mutability compile-time safe. As you can see I pass command as &mut which clearly indicates that it can be mutated inside apply_exclude_from() function

Compiler checks that at any given point of time there is no data races and that only one piece of code owns the data, so such mutability is specifically built into the language to make it as fast as possible and avoid copying

Moreover, Command type is built to be mutated, method arg() actually modifies Command and returns self and not a new instance of Command

Same applies to all complex data types in Rust like String is also mutable, exception is only primitive types like u32 that are typically cheaper to copy than mutate

let mut command = Command::new("ssh");

command
.arg(config.remote_machine_name.clone())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need clone here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I don't own config here, I only have a reference to it so I need to explicitly copy

https://doc.rust-lang.org/book/second-edition/ch04-01-what-is-ownership.html

src/main.rs Outdated

let start = Instant::now();

let result = execute_remote_command_impl(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impl?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust doesn't support function overload :( :( :(

So I can't give these functions a same name even though arguments are completely different

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a better name for impl function on mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pepyakin how do you typically deal with this problem? I tend to prefer functionalish way of programming for code like this by avoiding unnecessary struct/objects and working with just top level functions

However in many cases that means that I have a "shell" version of a function that does some side-effects like printing to stdout, and sort of "internal" version of a function that is more pure

But Rust doesn't let my functions have same name :(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is broad problem I think, and sometimes to overcome rust lack of overloading it just comes down to choosing appropriate names.

However, in this case I can propose to use something like this

use remote_command;

remote_command::execute_remote_command(
  // ...
);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even not uncommon to choose name for inner function assuming it will be usually called thru the module.

One example that comes to my mind is std::env::args().
args is pretty ambigious name in itself. However, it is very common to call it like

use std::env;

env::args()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat! I like this way much better, will do same for sync, thanks 👍

src/main.rs Outdated
let result = execute_remote_command_impl(
&args.command.clone(),
config,
&format!("~/mainframer/{}", working_dir.file_name().unwrap().to_string_lossy().clone()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don’t really like that the location is being repeated here and in the copy function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, lemme refactor that, thx

src/main.rs Outdated
use time::*;

fn main() {
println!(":: Mainframer v2.1.0\n");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version is duplicated here is there some way to pass in values into rust from tooling. e.g. custom defines in c++ or BuildConfig from Java.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, cargo publishes package version thru build-time environment variables.
https://doc.rust-lang.org/cargo/reference/environment-variables.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep yep, in plans in a separate PR, thanks for link!

@artem-zinnatullin
Copy link
Contributor Author

@pepyakin btw feel free to "approve" or "request changes", that works even if you're not a member of the repo, your feedback is super valuable and I definitely owe you couple beers or something :D

@@ -0,0 +1,6 @@
[package]
name = "mainframer"
version = "2.1.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to open an issue about it?

@@ -20,9 +20,6 @@ if [ "$USER_ID" == "0" ]; then
echo "Warning: running as r00t."
fi

# Run shellcheck.
docker run --rm --env SHELLCHECK_OPTS="--exclude SC2088" --volume `"pwd"`:/scripts:ro koalaman/shellcheck:v0.4.6 /scripts/mainframer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank God.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, actually I'm planning to add it back later to check our integration tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #194 to track that

eval $ANDROID_SDK_INSTALL_COMPONENT '"platform-tools"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"build-tools;25.0.2"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"platforms;android-25"' > /dev/null && \
chmod -R 777 "$ANDROID_HOME"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chmod a+rwx, please.

src/time.rs Outdated

#[test]
fn format_duration_2_minutes() {
assert_eq!(format_duration(&Duration::from_secs(120)), "2 minutes 0 seconds"); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting went off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed

text.push(' ');
}

push_value_and_label(&mut text, seconds, seconds_label);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are still joining strings yourself...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@artem-zinnatullin
Copy link
Contributor Author

Addressed remaining feedback, created few issues for changes I plan to do in separate PRs, going to merge it after CI

@artem-zinnatullin
Copy link
Contributor Author

@pepyakin SUPER BIG THANKS to you for review, that was very helpful and fun 😽

Gotta 🍻 when we have chance!

@ming13 thanks for review as well 😽

eval $ANDROID_SDK_INSTALL_COMPONENT '"platform-tools"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"build-tools;25.0.2"' > /dev/null && \
eval $ANDROID_SDK_INSTALL_COMPONENT '"platforms;android-25"' > /dev/null && \
chmod -R a+rwx "$ANDROID_HOME"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vse govoryat "AZ, a kak podnyat prava?"
"Kuda nazhat'?" - "Ch-Ch-cmod tri topora"
Sudo ne pishu, paroli ne vvozhu.
Vse, chto podnyal vchera -
Paru instansov dokera.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suk ubil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ u
miss u

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants