Unable to add logicalType to my json schema #556

Open
vishwaratna opened this issue Aug 28, 2023 · 0 comments

Hi There,

I have built a small application that converts JSON to Parquet files, and I would like to know how to use logicalType in its schema. Based on the examples and use cases provided in the library, I can use convertedType, the predecessor of logicalType, but I would prefer to follow the current standard and use logicalType. Could you advise on how to proceed?
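For reference, a minimal sketch of the convertedType form of such a schema, assuming the `convertedtype=UTF8` tag used in the library's examples. The `schemaNode` type and `parseSchema` helper are hypothetical and not part of parquet-go; they rely only on the standard library and merely check that the schema string is well-formed JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// convertedTypeSchema annotates each BYTE_ARRAY column with convertedtype=UTF8
// rather than logicaltype=STRING.
const convertedTypeSchema = `{
	"Tag": "name=parquet_go_root, repetitiontype=REQUIRED",
	"Fields": [
		{"Tag": "name=actionName, inname=ActionName, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=REQUIRED"},
		{"Tag": "name=systemName, inname=SystemName, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=REQUIRED"},
		{"Tag": "name=eventTimestamp, inname=EventTimestamp, type=BYTE_ARRAY, convertedtype=UTF8, repetitiontype=REQUIRED"}
	]
}`

// schemaNode is a hypothetical type mirroring the Tag/Fields layout of the schema.
type schemaNode struct {
	Tag    string       `json:"Tag"`
	Fields []schemaNode `json:"Fields"`
}

// parseSchema (hypothetical) decodes the schema string so that JSON typos
// surface before the string is handed to the Parquet writer.
func parseSchema(s string) (schemaNode, error) {
	var root schemaNode
	err := json.Unmarshal([]byte(s), &root)
	return root, err
}

func main() {
	root, err := parseSchema(convertedTypeSchema)
	if err != nil {
		fmt.Println("invalid schema JSON:", err)
		return
	}
	// Print the tag of every declared column.
	for _, f := range root.Fields {
		fmt.Println(f.Tag)
	}
}
```

The schema string itself could be passed to `writer.NewParquetWriter` in place of a logicalType schema; the parsing step is only a sanity check.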

Below is my sample code using logicalType; it does not work. Is there any way to define logicalType in my JSON schema?

package functions

import (
	"log"

	"github.com/xitongsys/parquet-go-source/local"
	"github.com/xitongsys/parquet-go/parquet"
	"github.com/xitongsys/parquet-go/writer"
)

// ProcessData is one row; its field names are referenced by the inname values in jsonSchema.
type ProcessData struct {
	EventTimestamp string `json:"event_timestamp"`
	ActionName     string `json:"action_name"`
	SystemName     string `json:"system_name"`
}

// jsonSchema declares every column as BYTE_ARRAY annotated with logicaltype=STRING.
var jsonSchema = `{
	"Tag": "name=parquet_go_root, repetitiontype=REQUIRED",
	"Fields": [
		{"Tag": "name=actionName, inname=ActionName, type=BYTE_ARRAY, logicaltype=STRING, repetitiontype=REQUIRED"},
		{"Tag": "name=systemName, inname=SystemName, type=BYTE_ARRAY, logicaltype=STRING, repetitiontype=REQUIRED"},
		{"Tag": "name=eventTimestamp, inname=EventTimestamp, type=BYTE_ARRAY, logicaltype=STRING, repetitiontype=REQUIRED"}
	]
}`

// ConvertToParquet writes num rows to ./json_schema.parquet using jsonSchema.
func ConvertToParquet() {
	fw, err := local.NewLocalFileWriter("./json_schema.parquet")
	if err != nil {
		log.Println("Can't create local file", err)
		return
	}
	defer fw.Close() // close the file on every return path

	// write
	pw, err := writer.NewParquetWriter(fw, jsonSchema, 4)
	if err != nil {
		log.Println("Can't create parquet writer", err)
		return
	}

	pw.RowGroupSize = 128 * 1024 * 1024 // 128M
	pw.CompressionType = parquet.CompressionCodec_SNAPPY
	num := 10
	for i := 0; i < num; i++ {
		stu := ProcessData{
			//some data
		}
		if err = pw.Write(stu); err != nil {
			log.Println("Write error", err)
		}
	}
	// WriteStop flushes buffered rows and writes the file footer.
	if err = pw.WriteStop(); err != nil {
		log.Println("WriteStop error", err)
		return
	}
	log.Println("Write Finished")
}
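The loop above leaves the record contents elided. As a rough sketch of how a JSON record could be decoded into `ProcessData` before being passed to `pw.Write`, using only the standard library: the `decodeRecord` helper and the sample record are hypothetical and serve purely as illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProcessData repeats the struct from the listing above so the sketch is self-contained.
type ProcessData struct {
	EventTimestamp string `json:"event_timestamp"`
	ActionName     string `json:"action_name"`
	SystemName     string `json:"system_name"`
}

// decodeRecord (hypothetical helper) maps one JSON object onto ProcessData
// via the json struct tags.
func decodeRecord(raw []byte) (ProcessData, error) {
	var rec ProcessData
	err := json.Unmarshal(raw, &rec)
	return rec, err
}

func main() {
	// Made-up sample record, for illustration only.
	raw := []byte(`{"event_timestamp":"2023-08-28T00:00:00Z","action_name":"login","system_name":"web"}`)
	rec, err := decodeRecord(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", rec)
}
```

The decoded value could then take the place of the empty `ProcessData{}` literal inside the write loop.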