This repository has been archived by the owner on Jul 12, 2024. It is now read-only.

Using pyrs with monkeytype for type inference #32

Open
flip111 opened this issue Dec 31, 2021 · 0 comments

Hello, I would like to share the results I obtained using pyrs together with MonkeyType, along with the process I followed.

These are the first commands I used:

cd typing
echo 'layout python3' > .envrc
direnv allow
git clone https://github.com/chonyy/fpgrowth_py.git
git clone https://github.com/Instagram/MonkeyType.git
git clone https://github.com/konchunas/pyrs.git
cd MonkeyType
python3 -m pip install -e .
cd ../fpgrowth_py
monkeytype run run.py
monkeytype list-modules # optional step that shows what `apply` accepts
monkeytype apply fpgrowth_py.utils

The fpgrowth_py repository (https://github.com/chonyy/fpgrowth_py.git) is the project being converted here, but it could be any project.

After that I ran:

monkeytype apply fpgrowth_py.fpgrowth

But I ran into a problem:

  File "/home/flip111/typing/fpgrowth_py/fpgrowth_py/utils.py", line 7, in Node
    def __init__(self, itemName: str, frequency: int, parentNode: Optional[Node]) -> None:
NameError: name 'Node' is not defined

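The annotation Optional[Node] is evaluated while the class Node is still being defined, so the name does not exist yet when utils.py is imported. I worked around this by temporarily removing the annotation : Optional[Node], running monkeytype again, and then putting the annotation back.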

After that it was time for pyrs:

python3 -m pyrs fpgrowth_py/fpgrowth_py/fpgrowth.py > fpgrowth_py/fpgrowth_py/fpgrowth.rs
python3 -m pyrs fpgrowth_py/fpgrowth_py/utils.py > fpgrowth_py/fpgrowth_py/utils.rs
rustfmt fpgrowth_py/fpgrowth_py/fpgrowth.rs
rustfmt fpgrowth_py/fpgrowth_py/utils.rs

rustfmt then complains:

» rustfmt fpgrowth_py/fpgrowth_py/utils.rs
error: unexpected closing delimiter: `}`
  --> /home/flip111/typing/fpgrowth_py/fpgrowth_py/utils.rs:48:1
   |
34 | fn getFromFile<T0, RT>(fname: T0) -> RT {
   |                                         - this opening brace...
...
46 | }
   | - ...matches this closing brace
47 | return (itemSetList, frequency);
48 | }
   | ^ unexpected closing delimiter

This is because the Python source code

def getFromFile(fname):
    itemSetList = []
    frequency = []
    
    with open(fname, 'r') as file:
        csv_reader = reader(file)
        for line in csv_reader:
            line = list(filter(None, line))
            itemSetList.append(line)
            frequency.append(1)

    return itemSetList, frequency

got translated into the following (indented by me for readability):

fn getFromFile<T0, RT>(fname: T0) -> RT {
    let mut itemSetList = vec![];
    let mut frequency = vec![];
    // with!(open(fname, "r") as file) //unsupported
    {
        let csv_reader = reader(file);
        }
        for line in csv_reader {
            line = line.into_iter().filter(None).collect::<Vec<_>>();
            itemSetList.push(line);
            frequency.push(1);
        }
    }
    return (itemSetList, frequency);
}

The with construct, used here to open the file, is unsupported. In addition, a stray closing brace was introduced after let csv_reader = reader(file); for some reason. I manually changed this to:

fn getFromFile<T0, RT>(fname: T0) -> RT {
    let mut itemSetList = vec![];
    let mut frequency = vec![];
    // with!(open(fname, "r") as file) //unsupported
    if let Ok(file) = std::fs::File::open(fname) {
        let csv_reader = reader(file);
        for line in csv_reader {
            line = line.into_iter().filter(None).collect::<Vec<_>>();
            itemSetList.push(line);
            frequency.push(1);
        }
    }
    return (itemSetList, frequency);
}
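
The manual fix above balances the braces, but it still relies on the untranslated reader(file) and filter(None) calls and silently skips the file if it cannot be opened. For comparison, here is a minimal sketch of how the function might look using only the Rust standard library. It is not what pyrs emits: the snake_case name get_from_file, the concrete return type, and the naive split on commas (instead of a real CSV parser) are all assumptions made for illustration.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Read a comma-separated file into item sets, one per line, each with frequency 1.
/// Assumption: fields are split on ',' naively (no quoting support), which is
/// simpler than what Python's csv.reader does.
fn get_from_file<P: AsRef<Path>>(fname: P) -> io::Result<(Vec<Vec<String>>, Vec<i32>)> {
    // The file is closed automatically when the reader is dropped,
    // which plays the role of Python's `with` block.
    let file = File::open(fname)?;
    let mut item_set_list = Vec::new();
    let mut frequency = Vec::new();

    for line in BufReader::new(file).lines() {
        let line = line?;
        // Counterpart of `list(filter(None, line))`: drop empty fields.
        let items: Vec<String> = line
            .split(',')
            .filter(|field| !field.is_empty())
            .map(String::from)
            .collect();
        item_set_list.push(items);
        frequency.push(1);
    }
    Ok((item_set_list, frequency))
}
```

Returning io::Result lets the caller decide what to do when the file is missing, whereas the if let Ok(file) version returns two empty vectors in that case.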

I had previously run this process without the MonkeyType step, so I was able to diff the two versions of the resulting Rust source code. Here is a diff of the pygrowth.rs file, which shows that with MonkeyType many more concrete types end up in the Rust output as well:

9,11c9,11
<     itemName: ST0,
<     count: ST1,
<     parent: ST2,
---
>     itemName: &str,
>     count: i32,
>     parent: Option<Node>,
17c17
<     fn __init__<T0, T1, T2>(&self, itemName: T0, frequency: T1, parentNode: T2) {
---
>     fn __init__(&self, itemName: &str, frequency: i32, parentNode: Option<Node>) {
24c24
<     fn increment<T0>(&self, frequency: T0) {
---
>     fn increment(&self, frequency: i32) {
54c54,58
< fn constructTree<T0, T1, T2, RT>(itemSetList: T0, frequency: T1, minSup: T2) -> RT {
---
> fn constructTree(
>     itemSetList: Vec<Union<Any, Vec<&str>>>,
>     frequency: Vec<Union<Any, i32>>,
>     minSup: f32,
> ) -> Union<(None, None), (Node, HashMap<&str, Vec<Union<i32, Node>>>)> {
92c96,103
< fn updateHeaderTable<T0, T1, T2>(item: T0, targetNode: T1, headerTable: T2) {
---
> fn updateHeaderTable(
>     item: &str,
>     targetNode: Node,
>     headerTable: HashMap<
>         &str,
>         Union<Vec<Option<i32>>, Vec<Option<Union<i32, Node>>>, Vec<Union<i32, Node>>>,
>     >,
> ) {
103c114,122
< fn updateTree<T0, T1, T2, T3, RT>(item: T0, treeNode: T1, headerTable: T2, frequency: T3) -> RT {
---
> fn updateTree(
>     item: &str,
>     treeNode: Node,
>     headerTable: HashMap<
>         &str,
>         Union<Vec<Option<i32>>, Vec<Option<Union<i32, Node>>>, Vec<Union<i32, Node>>>,
>     >,
>     frequency: i32,
> ) -> Node {
113c132
< fn ascendFPtree<T0, T1>(node: T0, prefixPath: T1) {
---
> fn ascendFPtree(node: Node, prefixPath: Vec<Union<Any, &str>>) {
119c138,141
< fn findPrefixPath<T0, T1, RT>(basePat: T0, headerTable: T1) -> RT {
---
> fn findPrefixPath(
>     basePat: &str,
>     headerTable: HashMap<&str, Vec<Union<i32, Node>>>,
> ) -> Union<(Vec<Any>, Vec<Any>), (Vec<Vec<&str>>, Vec<i32>)> {
134c156,161
< fn mineTree<T0, T1, T2, T3>(headerTable: T0, minSup: T1, preFix: T2, freqItemList: T3) {
---
> fn mineTree(
>     headerTable: HashMap<&str, Vec<Union<i32, Node>>>,
>     minSup: f32,
>     preFix: Set<&str>,
>     freqItemList: Vec<Union<Set<&str>, Any>>,
> ) {
151c178
< fn powerset<T0, RT>(s: T0) -> RT {
---
> fn powerset(s: Set<&str>) -> chain {
159c186
< fn getSupport<T0, T1, RT>(testSet: T0, itemSetList: T1) -> RT {
---
> fn getSupport(testSet: Union<Set<&str>, (&str)>, itemSetList: Vec<Vec<&str>>) -> i32 {
168c195,199
< fn associationRule<T0, T1, T2, RT>(freqItemSet: T0, itemSetList: T1, minConf: T2) -> RT {
---
> fn associationRule(
>     freqItemSet: Vec<Set<&str>>,
>     itemSetList: Vec<Vec<&str>>,
>     minConf: f32,
> ) -> Vec<Vec<Union<Set<&str>, f32>>> {
182c213
< fn getFrequencyFromList<T0, RT>(itemSetList: T0) -> RT {
---
> fn getFrequencyFromList(itemSetList: Vec<Vec<&str>>) -> Vec<i32> {

After this I copied the two new source files into a new project:

cargo new fpgrowth_rs
cp fpgrowth_py/fpgrowth_py/fpgrowth.rs fpgrowth_rs/src
cp fpgrowth_py/fpgrowth_py/utils.rs fpgrowth_rs/src
cd fpgrowth_rs

I added a module declaration for fpgrowth to src/main.rs:

mod fpgrowth;

fn main() {
    println!("Hello, world!");
}

I then tried to fix the source files with clippy:

cargo clippy --fix --allow-dirty

Clippy reported the following errors:

» cargo clippy --fix --allow-dirty
    Checking fpgrowth_rs v0.1.0 (/home/flip111/typing/fpgrowth_rs)
error[E0433]: failed to resolve: use of undeclared crate or module `fpgrowth_py`
 --> src/fpgrowth.rs:6:5
  |
6 | use fpgrowth_py::utils::*;
  |     ^^^^^^^^^^^ use of undeclared crate or module `fpgrowth_py`
  |
help: there is a crate or module with a similar name
  |
6 | use fpgrowth::utils::*;
  |     ~~~~~~~~

error[E0432]: unresolved imports `collections::defaultdict`, `collections::OrderedDict`
 --> src/fpgrowth.rs:4:19
  |
4 | use collections::{defaultdict, OrderedDict};
  |                   ^^^^^^^^^^^  ^^^^^^^^^^^ no `OrderedDict` in `collections`
  |                   |
  |                   no `defaultdict` in `collections`

error[E0432]: unresolved import `csv`
 --> src/fpgrowth.rs:5:5
  |
5 | use csv::reader;
  |     ^^^ use of undeclared crate or module `csv`

error[E0432]: unresolved import `itertools`
 --> src/fpgrowth.rs:7:5
  |
7 | use itertools::{chain, combinations};
  |     ^^^^^^^^^ use of undeclared crate or module `itertools`

error[E0432]: unresolved import `optparse`
 --> src/fpgrowth.rs:8:5
  |
8 | use optparse::OptionParser;
  |     ^^^^^^^^ use of undeclared crate or module `optparse`

error[E0412]: cannot find type `Set` in this scope
  --> src/fpgrowth.rs:14:11
   |
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
   |           ^^^ not found in this scope

error[E0412]: cannot find type `Union` in this scope
  --> src/fpgrowth.rs:14:31
   |
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
   |                               ^^^^^ not found in this scope
   |
help: consider importing one of these items
   |
1  | use crate::fpgrowth::collections::btree_set::Union;
   |
1  | use crate::fpgrowth::collections::hash_set::Union;
   |
1  | use std::collections::btree_set::Union;
   |
1  | use std::collections::hash_set::Union;
   |

error[E0412]: cannot find type `Set` in this scope
  --> src/fpgrowth.rs:14:37
   |
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
   |                                     ^^^ not found in this scope

error[E0425]: cannot find function `getFrequencyFromList` in this scope
  --> src/fpgrowth.rs:15:21
   |
15 |     let frequency = getFrequencyFromList(itemSetList);
   |                     ^^^^^^^^^^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find function `constructTree` in this scope
  --> src/fpgrowth.rs:17:33
   |
17 |     let (fpTree, headerTable) = constructTree(itemSetList, frequency, minSup);
   |                                 ^^^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find function `mineTree` in this scope
  --> src/fpgrowth.rs:22:9
   |
22 |         mineTree(headerTable, minSup, set(), freqItems);
   |         ^^^^^^^^ not found in this scope

error[E0425]: cannot find function `set` in this scope
  --> src/fpgrowth.rs:22:39
   |
22 |         mineTree(headerTable, minSup, set(), freqItems);
   |                                       ^^^ not found in this scope

error[E0425]: cannot find function `associationRule` in this scope
  --> src/fpgrowth.rs:23:21
   |
23 |         let rules = associationRule(freqItems, itemSetList, minConf);
   |                     ^^^^^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find function `getFromFile` in this scope
  --> src/fpgrowth.rs:28:36
   |
28 |     let (itemSetList, frequency) = getFromFile(fname);
   |                                    ^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find function `constructTree` in this scope
  --> src/fpgrowth.rs:30:33
   |
30 |     let (fpTree, headerTable) = constructTree(itemSetList, frequency, minSup);
   |                                 ^^^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find function `mineTree` in this scope
  --> src/fpgrowth.rs:35:9
   |
35 |         mineTree(headerTable, minSup, set(), freqItems);
   |         ^^^^^^^^ not found in this scope

error[E0425]: cannot find function `set` in this scope
  --> src/fpgrowth.rs:35:39
   |
35 |         mineTree(headerTable, minSup, set(), freqItems);
   |                                       ^^^ not found in this scope

error[E0425]: cannot find function `associationRule` in this scope
  --> src/fpgrowth.rs:36:21
   |
36 |         let rules = associationRule(freqItems, itemSetList, minConf);
   |                     ^^^^^^^^^^^^^^^ not found in this scope

warning: unused import: `std::collections::HashMap`
 --> src/fpgrowth.rs:1:5
  |
1 | use std::collections::HashMap;
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unnecessary parentheses around assigned value
  --> src/fpgrowth.rs:16:18
   |
16 |     let minSup = (itemSetList.len() * minSupRatio);
   |                  ^                               ^
   |
   = note: `#[warn(unused_parens)]` on by default
help: remove these parentheses
   |
16 -     let minSup = (itemSetList.len() * minSupRatio);
16 +     let minSup = itemSetList.len() * minSupRatio;
   | 

warning: unnecessary parentheses around assigned value
  --> src/fpgrowth.rs:29:18
   |
29 |     let minSup = (itemSetList.len() * minSupRatio);
   |                  ^                               ^
   |
help: remove these parentheses
   |
29 -     let minSup = (itemSetList.len() * minSupRatio);
29 +     let minSup = itemSetList.len() * minSupRatio;
   | 

error[E0277]: cannot multiply `usize` by `f32`
  --> src/fpgrowth.rs:16:37
   |
16 |     let minSup = (itemSetList.len() * minSupRatio);
   |                                     ^ no implementation for `usize * f32`
   |
   = help: the trait `std::ops::Mul<f32>` is not implemented for `usize`

error[E0308]: mismatched types
  --> src/fpgrowth.rs:31:23
   |
27 |   fn fpgrowthFromFile<T0, T1, T2, RT>(fname: T0, minSupRatio: T1, minConf: T2) -> RT {
   |                                   -- this type parameter
...
31 |       if fpTree == None {
   |  _______________________^
32 | |         println!("{:?} ", "No frequent item set");
33 | |     } else {
   | |_____^ expected type parameter `RT`, found `()`
   |
   = note: expected type parameter `RT`
                   found unit type `()`

error[E0308]: mismatched types
  --> src/fpgrowth.rs:37:16
   |
27 | fn fpgrowthFromFile<T0, T1, T2, RT>(fname: T0, minSupRatio: T1, minConf: T2) -> RT {
   |                                 -- this type parameter                          -- expected `RT` because of return type
...
37 |         return (freqItems, rules);
   |                ^^^^^^^^^^^^^^^^^^ expected type parameter `RT`, found tuple
   |
   = note: expected type parameter `RT`
                       found tuple `(std::vec::Vec<_>, _)`

Some errors have detailed explanations: E0277, E0308, E0412, E0425, E0432, E0433.
For more information about an error, try `rustc --explain E0277`.
warning: `fpgrowth_rs` (bin "fpgrowth_rs" test) generated 3 warnings
error: could not compile `fpgrowth_rs` due to 21 previous errors; 3 warnings emitted
warning: build failed, waiting for other jobs to finish...
warning: `fpgrowth_rs` (bin "fpgrowth_rs") generated 3 warnings (3 duplicates)
error: build failed

I have yet to inspect these errors and figure out whether they are best fixed before or after running pyrs.
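
As an illustration of the kind of fix involved, error E0277 in the log comes from multiplying itemSetList.len(), a usize, by an f32 ratio, which Rust does not convert implicitly. A minimal sketch of one possible fix with an explicit cast is shown here. The helper name min_support is hypothetical, since the generated code writes the expression inline.

```rust
/// Compute the minimum support threshold from the number of item sets and a ratio.
/// Hypothetical helper: the generated code inlines this expression.
fn min_support(item_count: usize, min_sup_ratio: f32) -> f32 {
    // `usize` does not implement `Mul<f32>`, so convert explicitly first.
    // `as` binds more tightly than `*`, so no parentheses are needed.
    item_count as f32 * min_sup_ratio
}
```

For example, min_support(item_set_list.len(), 0.5) would replace the failing expression, where item_set_list is an assumed local variable.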


Conclusion:
