colnames get lost after calling Rapply() #42

Closed
dtynn opened this issue Nov 1, 2017 · 2 comments

dtynn commented Nov 1, 2017

Hi Alex, thank you for your great work.

During my tests I noticed that the column names are lost after calling Rapply, and so are the detected column types.

Test code:

package main

import (
	"log"

	"github.com/kniren/gota/dataframe"
	"github.com/kniren/gota/series"
)

func main() {
	df := dataframe.LoadRecords(
		[][]string{
			[]string{"A", "B", "C", "D"},
			[]string{"a", "4", "5.1", "true"},
			[]string{"k", "5", "7.0", "true"},
			[]string{"k", "4", "6.0", "true"},
			[]string{"a", "2", "7.1", "false"},
		},
	)

	applied := df.Rapply(func(s series.Series) series.Series {
		return s
	})

	log.Println(df)
	log.Println(applied)
}

Output:

2017/11/01 17:38:32 [4x4] DataFrame

    A        B     C        D
 0: a        4     5.100000 true
 1: k        5     7.000000 true
 2: k        4     6.000000 true
 3: a        2     7.100000 false
    <string> <int> <float>  <bool>

2017/11/01 17:38:32 [4x4] DataFrame

    X0       X1       X2       X3
 0: a        4        5.100000 true
 1: k        5        7.000000 true
 2: k        4        6.000000 true
 3: a        2        7.100000 false
    <string> <string> <string> <string>

dtynn commented Nov 1, 2017

Hi Alex,
I read through the existing issues and found this:

We want to be able to apply functions to both rows and columns over a DataFrame. The dimension of the returned Series should be compatible with each other. Additionally, when applying functions over rows, since we can't expect the columns to be all of the same type, we will have to cast the types.

So maybe the output is as expected?
If so, please close the issue.

kniren commented Nov 1, 2017

Yeah, when using Rapply you cannot expect the aggregate function to rename the columns for you. The type casting is a necessity as well, and it is working as intended.

If your aggregate function returns the same number of elements for each row and you want to keep the column names, you should rename the columns of the resulting DataFrame accordingly.
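
A minimal sketch of the behaviour described above, written against the standard library only so that it runs on its own. The `frame` type and the `commonType`, `rapply`, and `restoreNames` helpers are hypothetical stand-ins, not gota's API or implementation. The sketch also assumes a simplified casting rule: a row whose columns disagree on type falls back to string.

```go
package main

import (
	"fmt"
	"strconv"
)

// frame is a hypothetical, minimal stand-in for a data frame: column names,
// one type label per column, and the cell values stored as strings.
type frame struct {
	names []string
	types []string
	rows  [][]string
}

// commonType returns the type shared by every column, or "string" when the
// columns disagree (assumed rule: any value can be represented as a string).
func commonType(types []string) string {
	if len(types) == 0 {
		return "string"
	}
	for _, t := range types[1:] {
		if t != types[0] {
			return "string"
		}
	}
	return types[0]
}

// rapply applies f to a copy of every row. Each row is treated as a single
// typed sequence, so the result carries the common type and the default
// column names X0, X1, ...
func rapply(in frame, f func(row []string) []string) frame {
	out := frame{}
	t := commonType(in.types)
	for _, r := range in.rows {
		out.rows = append(out.rows, f(append([]string(nil), r...)))
	}
	if len(out.rows) > 0 {
		for i := range out.rows[0] {
			out.names = append(out.names, "X"+strconv.Itoa(i))
			out.types = append(out.types, t)
		}
	}
	return out
}

// restoreNames copies the column names from src onto dst when the column
// counts match; otherwise it returns an error and leaves dst unchanged.
func restoreNames(dst *frame, src frame) error {
	if len(dst.names) != len(src.names) {
		return fmt.Errorf("column count mismatch: %d vs %d", len(dst.names), len(src.names))
	}
	dst.names = append([]string(nil), src.names...)
	return nil
}

func main() {
	df := frame{
		names: []string{"A", "B", "C", "D"},
		types: []string{"string", "int", "float", "bool"},
		rows: [][]string{
			{"a", "4", "5.1", "true"},
			{"k", "5", "7.0", "true"},
		},
	}

	// Identity function: the values survive, the names and types do not.
	applied := rapply(df, func(row []string) []string { return row })
	fmt.Println(applied.names, applied.types)

	// Rename the result by hand, as suggested above.
	if err := restoreNames(&applied, df); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(applied.names, applied.types)
}
```

In this model the names come back after the explicit rename, but the column types remain string. With gota itself, the rename step would go through the DataFrame's own renaming methods; check the package documentation for the exact signatures in your version.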

Thank you for the comment! If you disagree with the current behaviour, feel free to continue the discussion here. For the time being, I'm closing this issue.

Best,
Alex

@kniren kniren closed this as completed Nov 1, 2017