write_sparse_tensor_to_spmatrix #1023

### System Information
- **Factorization Machine**:

I am successfully using the function `write_spmatrix_to_sparse_tensor` to transform my data from a sparse matrix to the recordio format expected by SageMaker's factorization machine implementation.

Example:
```
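import io
import os
import boto3
import sagemaker.amazon.common as smac

def write_recordio(array, y, prefix, f):
    # Serialize the sparse matrix and labels to recordio-protobuf in memory
    buf = io.BytesIO()
    smac.write_spmatrix_to_sparse_tensor(array=array, file=buf, labels=y)
    buf.seek(0)  # rewind so the upload starts at the first byte

    fname = os.path.join(prefix, f)
    boto3.Session().resource('s3').Bucket('bucket').Object(fname).upload_fileobj(buf)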
```
An example of `array`, which holds the features:
```
  (0, 990290)	1.0
  (0, 1266265)	1.0
  (1, 560338)	1.0
  (1, 1266181)	1.0
  (2, 182872)	1.0
  (2, 1266205)	1.0
.................................
```
An example of `y`, which is my target:
`[1. 2. 1. ... 3. 1. 5.]`

`write_spmatrix_to_sparse_tensor` does the job. After training my model, I use Batch Transform and receive a `.out` file containing many outputs of type `<class 'record_pb2.Record'>`.
An example of one input record and its associated output record:
input:
```
features {
  key: "values"
  value {
    float32_tensor {
      values: 1.0
      values: 1.0
      keys: 990290
      keys: 1266265
      shape: 1266394
    }
  }
}
label {
  key: "values"
  value {
    float32_tensor {
      values: 1.0
    }
  }
}
```

output:
```
label {
  key: "score"
  value {
    float32_tensor {
      values: 1.5246734619140625
    }
  }
}
```
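
For illustration, here is a minimal sketch of pulling the scores back out of output records shaped like the one above. `extract_scores` and `fake_output` are hypothetical names, not part of the SageMaker SDK, and the stand-in objects only mimic the `label["score"].float32_tensor.values` fields shown in the example; I am assuming real `Record` objects can be accessed the same way.

```python
from types import SimpleNamespace


def extract_scores(records, key="score"):
    """Collect one float per record from record.label[key].float32_tensor.values."""
    return [float(rec.label[key].float32_tensor.values[0]) for rec in records]


def fake_output(score):
    # Stand-in for a record_pb2.Record (hypothetical test data, not real model output).
    tensor = SimpleNamespace(values=[score])
    return SimpleNamespace(label={"score": SimpleNamespace(float32_tensor=tensor)})


scores = extract_scores([fake_output(1.5246734619140625), fake_output(3.0)])
print(scores)
```

The resulting list has one entry per record, in file order, so it should line up with the rows of the matrix that was sent to Batch Transform.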

So now I have two files: one I originally wrote using `write_spmatrix_to_sparse_tensor`, and the `.out` file produced by `transformer.transform`. I would like a function `write_sparse_tensor_to_spmatrix` to exist that works on both of them (my original recordio file used for training and the Batch Transform output).

My own data pipeline is parquet -> pandas -> sparse_matrix -> recordio, and I need to reverse that process to get back to parquet for evaluation and eventually deployment. Whatever the use case, though, it seems users would frequently want to work back to their original format from both the model input and the Batch Transform output, and `write_sparse_tensor_to_spmatrix` would accomplish that.
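
To make the request concrete, here is a rough sketch of what the reverse direction could look like for the input records. This is not an existing SDK function: `records_to_coo` and `fake_record` are hypothetical names, and the stand-in objects only mimic the `features["values"].float32_tensor` and `label["values"].float32_tensor` layout from the example record above.

```python
from types import SimpleNamespace


def records_to_coo(records, feature_key="values", label_key="values"):
    """Turn Record-like objects into COO triplets, a shape, and a label list."""
    rows, cols, vals, labels = [], [], [], []
    n_rows, n_cols = 0, 0
    for i, rec in enumerate(records):
        tensor = rec.features[feature_key].float32_tensor
        rows.extend([i] * len(tensor.keys))        # one row index per stored value
        cols.extend(int(k) for k in tensor.keys)   # keys are the column indices
        vals.extend(float(v) for v in tensor.values)
        if len(tensor.shape):
            n_cols = max(n_cols, int(tensor.shape[0]))
        if label_key in rec.label:                 # labels are optional
            labels.append(float(rec.label[label_key].float32_tensor.values[0]))
        n_rows = i + 1
    return rows, cols, vals, (n_rows, n_cols), labels


def fake_record(keys, values, n_cols, label):
    # Stand-in for a record_pb2.Record (hypothetical test data).
    feat = SimpleNamespace(
        float32_tensor=SimpleNamespace(keys=keys, values=values, shape=[n_cols]))
    lab = SimpleNamespace(float32_tensor=SimpleNamespace(values=[label]))
    return SimpleNamespace(features={"values": feat}, label={"values": lab})


records = [
    fake_record([990290, 1266265], [1.0, 1.0], 1266394, 1.0),
    fake_record([560338, 1266181], [1.0, 1.0], 1266394, 2.0),
]
rows, cols, vals, shape, labels = records_to_coo(records)
print(shape, labels)
```

With SciPy available, `scipy.sparse.csr_matrix((vals, (rows, cols)), shape=shape)` would rebuild the matrix from these triplets. For the real training file, I assume the records could be obtained with `sagemaker.amazon.common.read_records` on the opened file, but I have not verified that against the `.out` file.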

