This repository has been archived by the owner on Jul 3, 2020. It is now read-only.
Frames with mixed types: Determine a column's type? Cast from Any to a more specific type? #86
Comments
JeffreyBenjaminBrown changed the title from "How to determine column types in a Frame with mixed types?" to "Frames with mixed types: how to determine a column's type, how to cast from Any to a more specific type" (Oct 27, 2019)
Continuing the previous example, suppose we have the following mixed-type data frame:

    import org.saddle.io._
    val u = org.saddle.Series( 0,1,2 )
    val v = org.saddle.Series( "0","1","2" )
    val f = org.saddle.Frame(
      "u" -> u . asInstanceOf[ org.saddle.Series[Int,Any] ],
      "v" -> v . asInstanceOf[ org.saddle.Series[Int,Any] ] )

and we'd like to extract a column from it. The following will work:

    val u2 : org.saddle.Series[Int,Any] =
      f . col("u") . colAt(0)

But that's unsafe, because it uses the signature `org.saddle.Series[Int,Any]`, which says nothing about what the cells actually hold.

This, on the other hand, doesn't work:

    val u3 : org.saddle.Series[Int,Int] = ( f
      . col("u") . colAt(0) )

It generates the following error:

    error: type mismatch;
     found   : org.saddle.Series[Int,Any]
     required: org.saddle.Series[Int,Int]
    Note: Any >: Int, but class Series is invariant in type T.
    You may wish to define T as -T instead. (SLS 4.5)
      . col("u") . colAt(0) )
      ^
    org.saddle.Series[Int,Any] <: org.saddle.Series[Int,Int]?
    false

If a
JeffreyBenjaminBrown changed the title from "Frames with mixed types: how to determine a column's type, how to cast from Any to a more specific type" to "Frames with mixed types: Determine a column's type? Cast from Any to a more specific type?" (Oct 27, 2019)
Solved. Not beautiful, but maybe good enough to keep me out of trouble.

Determine the type of a cell:

    data.colAt(0).raw(0).getClass

Cast a column of Any to a more specific type:

    import org.saddle._
    val u = org.saddle.Series(0,1,2)
    val v = org.saddle.Series("a","b","c")
    val data = org.saddle.Frame(
      u . asInstanceOf[ org.saddle.Series[Int,Any] ],
      v . asInstanceOf[ org.saddle.Series[Int,Any] ] )

    def getNumericCol[A,B] (
      // Unsafe -- could be called where it makes no sense
      f : org.saddle.Frame[A,B,Any],
      i : B )
        : org.saddle.Series[A,Int] = {
      f . col(i) . colAt(0) .
        asInstanceOf[ org.saddle.Series[A,Int] ] }

    getNumericCol(data,0) // Works!
    getNumericCol(data,1) // "Successfully" called where it makes no sense.
                          // I wish it threw an error.
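
The silent "success" of the second call is most likely down to type erasure: `asInstanceOf` on a generic container is not checked against the container's elements at runtime. Below is a minimal sketch of a checked alternative. It assumes the cell values have already been pulled out of the column into a plain `Seq[Any]` (for instance by reading each row with the `raw` accessor used above). `checkedInts` is a hypothetical helper, not part of Saddle.

```scala
// Hypothetical helper, not part of Saddle: verify every cell before narrowing
// a column of Any to Int, so that a bad column fails loudly at the cast site.
def checkedInts(cells: Seq[Any]): Seq[Int] =
  cells.zipWithIndex.map {
    case (i: Int, _) => i // boxed Ints match this pattern
    case (other, row) =>
      val found = if (other == null) "null" else other.getClass.getName
      throw new ClassCastException(s"row $row: expected Int, found $found")
  }

val numbers: Seq[Any] = Seq(0, 1, 2)
val strings: Seq[Any] = Seq("a", "b", "c")

// An unchecked cast of the container compiles and runs, because the element
// type is erased at runtime; this mirrors getNumericCol(data, 1).
val unchecked = strings.asInstanceOf[Seq[Int]]

// The checked version accepts the numeric column and rejects the string one.
val ok = checkedInts(numbers)
val failed = scala.util.Try(checkedInts(strings)).isFailure
```

The trade-off is one pass over the column per cast, in exchange for an error that points at the offending row rather than surfacing later wherever a cell is first used as an `Int`.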
How to determine column types in a Frame with mixed types?
(Here's another issue on heterogeneous data.)
After some head-scratching I managed to create a data frame with mixed types:
Suppose one were to discover (say, months after forgetting why it was made) a mysterious `Frame` with such mixed types. Without retracing the code that generates it, how could one determine the type of its columns?

For instance, in the value `f` defined above, the elements at (0,0) and (0,1) look indistinguishable to me. But one is a number and the other is a string.

I might be hoping for something like the `dtype` and `dtypes` methods from Python's Pandas. (I searched the codebase, didn't find the string "dtype".) Among other uses, that would let someone verify that the contents of a column are all of the same type -- violations of which seem both easy to achieve and difficult to debug.
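
To make the request concrete, here is a rough sketch of what a `dtype`-style check could look like. It works on plain Scala collections rather than on a Saddle `Frame`, and assumes each column's cells have been extracted into a `Seq[Any]`. The names `dtype`, `dtypes` and `isHomogeneous` are hypothetical; nothing here is an existing Saddle API.

```scala
// Hypothetical helpers, not part of Saddle: a rough analogue of pandas'
// dtype / dtypes, built on the runtime class of each cell.

// The set of runtime class names found in one column.
def dtype(cells: Seq[Any]): Set[String] =
  cells.map(c => if (c == null) "null" else c.getClass.getName).toSet

// The same report for every named column.
def dtypes(columns: Map[String, Seq[Any]]): Map[String, Set[String]] =
  columns.map { case (name, cells) => name -> dtype(cells) }

// True when a column holds at most one runtime class.
def isHomogeneous(cells: Seq[Any]): Boolean = dtype(cells).size <= 1

val columns: Map[String, Seq[Any]] = Map(
  "u" -> Seq(0, 1, 2),
  "v" -> Seq("0", "1", "2"))

val report = dtypes(columns)
// report("u") == Set("java.lang.Integer")
// report("v") == Set("java.lang.String")
```

Because the check inspects every cell, it also flags a column that mixes types internally, which is the kind of violation that is otherwise hard to debug.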