Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slower rows iterator on server than local machine #1892

Closed
a-yahaha opened this issue May 6, 2024 · 1 comment
Closed

Slower rows iterator on server than local machine #1892

a-yahaha opened this issue May 6, 2024 · 1 comment

Comments

@a-yahaha
Copy link

a-yahaha commented May 6, 2024

本地测试代码

测试代码如下:

func main() {
	fmt.Println("hello")
	readExcel("/Users/bytedance/Downloads/excel/测试数据100列4500行.xlsx")
}
// 模拟excel读取数据
func readExcel(path string) {
    options := excelize.Options{
       RawCellValue:     false,
       ShortDatePattern: "yyyy-mm-dd",
       LongTimePattern:  "HH:mm:ss",
       LongDatePattern:  "yyyy-mm-dd HH:mm:ss",
       CultureInfo:      excelize.CultureNameZhCN,
    }
    excel, err := excelize.OpenFile(path, options)
    if err != nil {
       fmt.Println("open excel failed", err)
       return
    }
    totalRows, _ := GetRowsCountIterator(excel, "Sheet1")
    if totalRows == 0 {
       fmt.Println("GetRowsCountIterator is empty")
       return
    }
}
func GetRowsCountIterator(excel *excelize.File, sheetName string) (int64, error) {
	// 获取时间才 1.1s
	startTime := time.Now()
	defer func() {
		if err := excel.Close(); err != nil {
			fmt.Printf("excel GetRowsCount file close error:%+v", err)
		}
		fmt.Printf("GetRowsCountIterator cost: %v \n", time.Since(startTime))
	}()
	rows, err := excel.Rows(sheetName)
	if err != nil {
		fmt.Println("GetRowsCountIterator failed", err)
		return 0, err
	}
	var totalRows int64
	for rows.Next() {
		row, err := rows.Columns(excelize.Options{
			RawCellValue: false,
		})
		if err != nil {
			break
		}
		if len(row) > 0 {
			totalRows++
		}
	}
	if err = rows.Close(); err != nil {
		fmt.Printf("excel GetRowsCountIterator rows close  error:%+v", err)
		return 0, err
	}
	fmt.Printf("GetRowsCountIterator total: %d \n", totalRows)
	return totalRows, nil
}

耗时

GetRowsCountIterator cost: 1.156624625s 

服务器上测试

代码如下

// GetRowsCount retrieves the total rows count from the excelize iterator.
func GetRowsCount(f *excelize.File, sheetName string) (int64, error) {
	defer func() {
		if err := f.Close(); err != nil {
			fmt.Printf("excel GetRowsCount file close error:%+v", err)
		}
	}()
	var totalRows int64
	startTime := time.Now()
	rows, err := f.Rows(sheetName)
	fmt.Printf("[costTime]excel GetRowsCount Rows cost: %v", time.Since(startTime))
	if err != nil {
		fmt.Printf("excel GetRowsCount getRows error:%+v", err)
		return 0, err
	}
	startTime = time.Now()
	for rows.Next() {
		row, err := rows.Columns(excelize.Options{
			RawCellValue: false,
		})
		if err != nil {
			break
		}
		if len(row) > 0 {
			totalRows++
		}
	}
	fmt.Printf("[costTime]excel GetRowsCount Rows.next cost: %v", time.Since(startTime))
	startTime = time.Now()
	if err = rows.Close(); err != nil {
		fmt.Printf("excel GetRowsCount rows close  error:%+v", err)
		return 0, err
	}
	fmt.Printf("[costTime]excel GetRowsCount Rows.close cost: %v", time.Since(startTime))
	return totalRows, nil
}

耗时

[costTime]excel GetRowsCount Rows.next cost: 6.809400376s

服务器的配置 8G 16G

问题

初步排查是 rows.Columns在服务端的性能很差,基本耗时都在这里。

有几个疑问

  1. 是否只能通过行迭代器获取excel中sheet的行总数
  2. Columns的性能为什么有这么大的差异,是否有其他优化的方法

测试文件

测试文件: 100列4500行文本

测试数据100列4500行.xlsx

@xuri xuri changed the title 使用行迭代器本地和服务器的性能相差很大 Slower rows iterator on server than local machine May 11, 2024
@xuri
Copy link
Member

xuri commented May 11, 2024

Thanks for your issue. Is your local machine environment different from your server? The performance is also impacted by these factors: OS, Go language version, Excelize library version, hardware such as CPU, memory, storage equipment (HDD or SSD), etc. I don't have enough information to reproduce this issue, but here are some performance figures for your reference: Performance Figures. I'll close this issue, if you have any questions, please let me know to reopen this anytime.

@xuri xuri closed this as completed May 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants