SKD replace scans #16

Closed
funderburkjim opened this issue Dec 15, 2022 · 15 comments

@funderburkjim
Contributor

This note provides instructions for replacing the current skd scans with those provided by @maltenth at #14.

The basic idea is to create a new version of the repository https://github.com/sanskrit-lexicon-scans/skd.
Then @funderburkjim can pull this repository to Cologne and move the pdfpages folder to the location expected by the servepdf application.

page names

Each page should be converted to a pdf.
The file name of each pdf should be consistent with https://github.com/sanskrit-lexicon/csl-websanlexicon/blob/master/v02/distinctfiles/skd/web/webtc/pdffiles.txt, since the servepdf application uses this file to know what page to serve.
For instance, page 2-010 (page 10 of the 2nd volume of skd) is retrieved by the URL
https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php?dict=skd&page=2-010

In pdffiles.txt 2-010 is associated with file name pg2_010.pdf at the line
2-010:pg2_010.pdf:kawItala.
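
The page-to-file lookup described above can be sketched as follows. This is only an illustration, not the servepdf code: the function names `parse_pdffiles` and `servepdf_url` are hypothetical, the `page:filename:third-field` layout is inferred from the single example line, and the trailing period in that line is assumed to be sentence punctuation rather than part of the file.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

# Base URL copied from the example URL in this comment.
SERVEPDF = "https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/servepdf.php"


def parse_pdffiles(text: str) -> Dict[str, Tuple[str, str]]:
    """Map page id -> (pdf file name, third field) for pdffiles.txt-style lines.

    Assumed layout (inferred from one example line): page:filename:third-field.
    """
    pages: Dict[str, Tuple[str, str]] = {}
    for raw in text.splitlines():
        parts = raw.strip().split(":", 2)
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank or malformed lines
        third = parts[2] if len(parts) == 3 else ""
        pages[parts[0]] = (parts[1], third)
    return pages


def servepdf_url(dict_code: str, page: str) -> str:
    """Build a servepdf URL of the same shape as the example above."""
    return SERVEPDF + "?" + urlencode({"dict": dict_code, "page": page})


index = parse_pdffiles("2-010:pg2_010.pdf:kawItala\n")
filename, third_field = index["2-010"]
url = servepdf_url("skd", "2-010")
```

Here `filename` is the pdf name that a new scan of page 2-010 would need to carry, and `url` has the same form as the example URL.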

@Andhrabharati Once you have a local folder with all the pdf pages appropriately named,
then let us consider in more detail the logistics involved with the sanskrit-lexicon-scans repository.
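
One possible way to check that a local folder is "appropriately named" is to compare its pdf names with the names listed in pdffiles.txt. The sketch below assumes the `page:filename:...` line layout shown above; the helper names (`expected_names`, `pdf_names_in`, `compare_names`) are hypothetical and not part of any Cologne tooling.

```python
from pathlib import Path
from typing import Iterable, List, Set, Tuple


def expected_names(pdffiles_text: str) -> Set[str]:
    """Collect the file-name field (2nd colon-separated field) of each line."""
    names: Set[str] = set()
    for line in pdffiles_text.splitlines():
        parts = line.strip().split(":")
        if len(parts) >= 2 and parts[1]:
            names.add(parts[1])
    return names


def pdf_names_in(folder: Path) -> List[str]:
    """List pdf file names under folder, recursively (pages may sit in subfolders)."""
    return [p.name for p in folder.rglob("*.pdf")]


def compare_names(expected: Set[str], actual: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, extra): names expected but absent, and names not expected."""
    actual_set = set(actual)
    missing = sorted(expected - actual_set)
    extra = sorted(actual_set - expected)
    return missing, extra
```

For example, `compare_names(expected_names(text), pdf_names_in(Path("scans")))` would return two empty lists when the folder matches pdffiles.txt exactly; `scans` is a placeholder folder name.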

@Andhrabharati

Andhrabharati commented Dec 16, 2022

> In pdffiles.txt 2-010 is associated with file name pg2_010.pdf at the line 2-010:pg2_010.pdf:kawItala.
>
> @Andhrabharati Once you have a local folder with all the pdf pages appropriately named, then let us consider in more detail the logistics involved with the sanskrit-lexicon-scans repository.

@funderburkjim
Here are the first pages of the 5 volumes-

pg1_001.pdf
pg2_001.pdf
pg3_001.pdf
pg4_001.pdf
pg5_001.pdf

and the last pages of the respective volumes-

pg1_315.pdf
pg2_937.pdf
pg3_792.pdf
pg4_565.pdf
pg5_555.pdf

As I had suggested earlier, I first tried reducing the size by a factor of 25, but the file quality was not satisfactory; hence I resorted to a factor-of-20 reduction instead. [The overall size is about 1.2 GB for the dictionary pages; the intro pages are skipped in this lot!]

What's the next step to do?

@funderburkjim
Contributor Author

I have

  1. renamed sanskrit-lexicon-scans/skd to skd-v0
  2. created a new repository skd
  3. invited you Andhrabharati as a collaborator.

You should get an email from GitHub regarding the invitation. Please accept the invitation.

When I get notification of your acceptance, we can proceed.

Note: I don't do github management stuff very often.
So there may be a couple of false starts.

Can I assume you have command-line access to GitHub (such as via Git Bash for Windows)?

If you don't have command-line access, we can go some other route.

@funderburkjim
Contributor Author

@Andhrabharati adding this comment so you'll be sure to see the previous comment.

@Andhrabharati

I never tried command line access, so not sure about it.

I am using Github Desktop app, and have access to the repo files through it.

@funderburkjim
Contributor Author

I don't use Github Desktop, so let's go to plan B.

Put your images in a folder (or zip) somewhere in the cloud, and provide a URL that I can use to download them.

@Andhrabharati

I am sending them to funderberkjim/skd repo.

Thought you wanted them there.

@funderburkjim
Contributor Author

That should be ok.
I'll check back in a couple of hours and look at funderburkjim/skd (note spelling: ..burk..)

@Andhrabharati

Sorry for my wrong spelling.

Please note that I have stored the scan pages in separate folders by volume (1 to 5) for my convenience, and am uploading them as such.

@funderburkjim
Contributor Author

I see 5 volumes. Ready for me to work with?

@drdhaval2785

drdhaval2785 commented Dec 17, 2022

@funderburkjim

Just a note on GitHub Desktop. I have worked with both the CLI and the Desktop app. The Desktop app has buttons for add, commit and push, so whatever files one selects are processed when these buttons are pressed. The outcome is identical.

@Andhrabharati

Yes, please. @funderburkjim

@funderburkjim
Contributor Author

I am close to the end of the MW accent review. It may be several days before I finish the skd scan install.

@Andhrabharati

Andhrabharati commented Dec 17, 2022

@funderburkjim

I wanted to say that you should cover the annexure pages of MW as well for accents.

You had mentioned in the beginning that you are leaving those pages, as I had 'seen' them in my working last year (1st quarter).

I would like to remind you again that I was to 'work' on the comp.word headers and groups (specifically), but it did not happen as you wanted to do something before I start it. And for some reason, you did not 'inform' if you had done your intended work, for me to finish my part. So, I had covered many other areas/repos at CDSL from that time.

This definitely is a wrong place to say this, but the occasion dictates to post this message here.

@gasyoun
Member

gasyoun commented Dec 17, 2022

> This definitely is a wrong place to say this, but the occasion dictates to post this message here.

Agree.

> I would like to remind you again that I was to 'work' on the comp.word headers and groups (specifically), but it did not happen as you wanted to do something before I start it.

Of utmost interest.

@funderburkjim
Contributor Author

New scans replaced at Cologne.

12-28-2022
Replace pdfs, i.e., a new version of SKDScanpdf directory

The old pdfs are in a certain location on the Cologne server, e.g.,
 https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf/pg2_340.pdf


User (Andhrabharati) uploaded a new version of the pdfs to some temporary
repository, under the same names (e.g. pg2_340.pdf)
The objective is to replace the old pdfs with the new pdfs on the Cologne
server.

1. Download new images into some temporary directory at Cologne (tempnewscans)
2. mv SKDScanpdf SKDScanpdf-v0  # move old images to new location
3. mv tempnewscans SKDScanpdf   # move new images into the live location

Now, 
https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf/pg2_340.pdf will display the new image pdf.
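
For anyone repeating this kind of swap, the two `mv` steps can be sketched in Python with a couple of safety checks. This is not what was run at Cologne (the `mv` commands above were); the function name `swap_in_new_scans` is hypothetical, and the demonstration uses a throwaway temporary directory with dummy file contents.

```python
import tempfile
from pathlib import Path


def swap_in_new_scans(parent: Path,
                      live: str = "SKDScanpdf",
                      new: str = "tempnewscans",
                      backup: str = "SKDScanpdf-v0") -> None:
    """Rename live -> backup, then new -> live, all inside parent."""
    live_dir, new_dir, backup_dir = parent / live, parent / new, parent / backup
    if not new_dir.is_dir():
        raise FileNotFoundError(f"new scans not found: {new_dir}")
    if not live_dir.is_dir():
        raise FileNotFoundError(f"live directory not found: {live_dir}")
    if backup_dir.exists():
        raise FileExistsError(f"backup already exists: {backup_dir}")
    live_dir.rename(backup_dir)   # step 2: keep the old pdfs for comparison
    new_dir.rename(live_dir)      # step 3: new pdfs now served under the old path


# Demonstration on a temporary directory with dummy one-page "scans".
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, content in (("SKDScanpdf", b"old"), ("tempnewscans", b"new")):
        (root / name).mkdir()
        (root / name / "pg2_340.pdf").write_bytes(content)
    swap_in_new_scans(root)
    live_bytes = (root / "SKDScanpdf" / "pg2_340.pdf").read_bytes()
    old_bytes = (root / "SKDScanpdf-v0" / "pg2_340.pdf").read_bytes()
    temp_gone = not (root / "tempnewscans").exists()
```

Because both steps are renames within one parent directory, the file names (and hence the URLs) stay the same; only the directory contents change.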

File size comparison: the new scans are larger; for this sample page, about 2.5 times as many bytes (386518 vs. 157614).
> ls  -all SKDScanpdf/pg2_340.pdf
-rw-r--r-- 1 jfunderb uniuser 386518 29. Dez 01:12 SKDScanpdf/pg2_340.pdf
> ls  -all SKDScanpdf-v0/pg2_340.pdf
-rw------- 1 wwwadm1 uniuser 157614  3. Okt 2013  SKDScanpdf-v0/pg2_340.pdf

Temporarily, for comparison purposes, an old page can still be accessed, e.g.,
https://www.sanskrit-lexicon.uni-koeln.de/scans/SKDScan/SKDScanpdf-v0/pg2_340.pdf 

SKDScanpdf-v0 will eventually be removed from the Cologne server.

Hope @drdhaval2785 and other fans of SKD will find the new images somewhat better.
