Translation: arrow.qmd #60

scopinho · 2023-11-23T13:34:00Z

Translation of arrow.qmd.
Important points:

1-) The chapter uses the dataset of items checked out from the Seattle public libraries, which is available online at (https://data.seattle.gov/Community/Checkouts-by-Title/tmmm-ytt6).
We need to decide whether it will be included in the dados package and translated, or whether we follow some other strategy. The problem is that its CSV is 9 GB and is stored in an AWS S3 bucket here: "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv". For now, I kept the code without translating the field names, and tried to explain some terms throughout the text, such as CheckoutYear (ano da retirada, the checkout year).

2-) As in databases.qmd, I used "roda por trás do dplyr" for "dplyr backend" and then used "backend" throughout the rest of the text.

3-) For tree-like I used "semelhantes a árvores" (but it sounded a bit odd to me), so I also left the English term next to it.

Suggestions are welcome!

beatrizmilz · 2023-11-23T20:15:46Z

@decarvaa


scopinho · 2023-11-24T13:55:35Z

@decarvaa, thank you very much for the help with the review! Excellent!

decarvaa · 2023-11-26T07:01:59Z

@scopinho, no problem at all! The translation is very well done, and I learn a lot doing the reviews!

beatrizmilz · 2023-11-28T02:16:54Z

Thank you @scopinho for documenting this dataset issue!!

@rivaquiroga In the book, there are big datasets stored in AWS, and the code points to how the reader can download the data directly from AWS.

I guess, in this case, we can:
1) use the dataset as it is (in English), or
2) translate it locally and upload the CSV file to a similar cloud instance.

What is your opinion? Did you face something like this in the first edition?

@scopinho What do you think?

beatrizmilz · 2023-11-28T02:41:52Z

Hi @scopinho and @decarvaa!
Thank you for the translation and the first review.
Something unrelated to the translation itself: I had already heard of the parquet package, but I have never used it. The chapter got me very interested!! It looks very good.

About my review: there are a few typos, but most of it is things I noted to try to make the text read more smoothly. See what makes sense to you! :D

scopinho

I also thought it was better to use "inspecionar" (inspect) for scan in this case.

scopinho · 2023-11-28T09:01:53Z

Thanks @beatrizmilz, I've already accepted the review. About arrow: since I started using it, I use it in almost every case where the data is larger than about 5 GB, up to about 400 GB. Tips: 1-) Read about the data types, because sometimes you need to do some conversions. 2-) Since it doesn't have all the functions, using it together with duckdb saves your skin. 3-) Window functions such as lead/lag don't work with it, so you would have to convert to a tibble or write a different, vectorized function. Good luck!

scopinho · 2023-11-28T09:09:48Z

@beatrizmilz, my two cents: I like option 2. Although for the r4ds translation it might not be a big deal to leave it as is (in English), I see more and more examples where the data comes from the web, as in the web scraping chapter and other articles we may want to translate in the future. Hence, if we could have a bucket somewhere where the link won't go away, it could be a good asset for this and other translation projects.

beatrizmilz · 2023-11-28T12:00:30Z

I'll work on that, and will be back with new info!

beatrizmilz · 2023-11-28T14:02:32Z

cienciadedatos/dados#105

rivaquiroga · 2023-12-05T18:08:56Z

@beatrizmilz, the use of AWS is new to the second edition. Did they store the datasets there because they are too big?

scopinho · 2023-12-05T18:31:43Z

Hi @rivaquiroga, apologies for jumping in here, but Beatriz mentioned she is a bit busy, so I hope you don't mind if I share some info I have. I believe your guess is correct. The idea with arrow is to use a dataset that is big enough to showcase the technology (bigger than memory), and that CSV is 9 GB.

beatrizmilz · 2023-12-18T12:17:17Z

Idea: we can go ahead with this chapter using the English version of the data and, once the translated data is available, update it.
What do you think, @scopinho?

scopinho · 2023-12-18T12:58:40Z

I agree, since that way we at least start making it available to readers.

beatrizmilz · 2023-12-18T13:02:27Z

I created another issue for this specific task (so we don't forget, haha). So I'll accept this PR!

Can I accept it? Do you still want to edit anything?

scopinho · 2023-12-18T13:07:48Z

Go ahead and accept it... for now this is all I have. If we decide to change it, we can open another PR later. Thanks

Tradução: arrow

08d78c1

scopinho mentioned this pull request Nov 23, 2023

29 - arrow.qmd #29

Closed

beatrizmilz linked an issue Nov 23, 2023 that may be closed by this pull request

29 - arrow.qmd #29

Closed

beatrizmilz added the Precisa: Revisor(a) label Nov 23, 2023

decarvaa suggested changes Nov 24, 2023

scopinho and others added 10 commits November 24, 2023 10:07

Update arrow.qmd: f3b8e6d, 6e95a7a, 5fab69e, 70756a0, 439504a, 006cf78, b14c986, 248240c, 0c78623, 2ecb91b

Each commit: Co-authored-by: Arthur C. Silva <108061108+decarvaa@users.noreply.github.com>

decarvaa suggested changes Nov 24, 2023

scopinho and others added 6 commits November 24, 2023 10:10

Update arrow.qmd: 1b389ac, 09fa6bf, 99e1647, 1ff1635, 9aaf72f

Apply suggestions from code review: 42d8d6c

Each commit: Co-authored-by: Arthur C. Silva <108061108+decarvaa@users.noreply.github.com>

beatrizmilz reviewed Nov 28, 2023
