newftp.epa.gov #279
Comments
mxplusb
added
EPA
In Progress
labels
Jan 26, 2017
mxplusb
added this to the January milestone
Jan 26, 2017
Plazmaz
commented
Jan 26, 2017
|
Looks like http://newftp.epa.edu/ is down |
mheistermann
commented
Jan 26, 2017
|
@Plazmaz it's ftp://newftp.epa.gov/ |
Serubin
commented
Jan 26, 2017
|
Updated current download status. |
JeremiahCurtis
commented
Jan 26, 2017
|
tried wget but it stopped because of login issues |
JeremiahCurtis
commented
Jan 26, 2017
|
--11:42:18-- ftp://newftp.epa.gov/EPADataCommons/ unlink: No such file or directory FINISHED --11:42:18-- |
Serubin
commented
Jan 26, 2017
|
@JeremiahCurtis Give it another try. That happens every so often. These still need to be downloaded. The RSEI directory looks daunting - might split that up a bit. |
JeremiahCurtis
commented
Jan 26, 2017
|
ftp://newftp.epa.gov/RSEI/Version233_RY2012/Aggregated_Grid_Cell_Data/ working on the above csv files; since wget is having problems, i am doing direct downloads |
JeremiahCurtis
commented
Jan 26, 2017
|
this may take a while |
JeremiahCurtis
commented
Jan 26, 2017
|
direct download not working either...not sure what's up |
Serubin
commented
Jan 26, 2017
•
|
It appears the server is gone. ftp://ftp.epa.gov is still up |
Serubin
commented
Jan 26, 2017
|
Final data count: |
JeremiahCurtis
commented
Jan 26, 2017
|
now the direct download is working again... |
JeremiahCurtis
commented
Jan 26, 2017
|
there are 3 massive csv files at ftp://newftp.epa.gov/RSEI/Version233_RY2012/Disaggregated_Microdata/ each is about 110 GB |
Serubin
commented
Jan 26, 2017
|
@JeremiahCurtis Pull down whatever you can - I'm unable to access |
JeremiahCurtis
commented
Jan 26, 2017
|
working on it |
JeremiahCurtis
commented
Jan 26, 2017
|
direct download is kind of ineffective for a 110 GB file, though. If my browser crashes, I have to start all over....any ideas? |
JeremiahCurtis
commented
Jan 26, 2017
|
I'm also running downthemall on thousands of files from a lot of the directories at http://cdiac.ornl.gov/ftp/ |
Serubin
commented
Jan 26, 2017
|
Using wget might be a good idea. The download rates are limited to about 500 KB/s |
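For the "browser crashes and I have to start all over" problem, here is a minimal sketch of a resumable FTP download using Python's standard `ftplib`. It assumes the server honors the FTP `REST` command (not confirmed anywhere in this thread), and the function names `resume_offset`, `fetch_with_resume` and `example` are made up for illustration. The file name in `example` is the one reported later in the thread.

```python
import os
from ftplib import FTP


def resume_offset(local_path):
    """Return how many bytes of local_path already exist (0 if it is absent)."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def fetch_with_resume(ftp, remote_name, local_path, blocksize=1 << 16):
    """Download remote_name through an ftplib.FTP-like object.

    If a partial local file exists, ask the server to restart at that byte
    offset and append to it instead of starting over.
    """
    offset = resume_offset(local_path)
    with open(local_path, "ab") as out:
        # rest=<offset> makes ftplib send "REST <offset>" before RETR;
        # rest=None requests the whole file from the beginning.
        ftp.retrbinary("RETR " + remote_name, out.write,
                       blocksize=blocksize, rest=offset or None)
    return os.path.getsize(local_path)


def example():
    """Hypothetical usage against the server discussed here (anonymous login)."""
    ftp = FTP("newftp.epa.gov", timeout=60)
    ftp.login()
    ftp.cwd("/RSEI/Version233_RY2012/Disaggregated_Microdata")
    fetch_with_resume(ftp, "Micro2012_2012.csv", "Micro2012_2012.csv")
    ftp.quit()
```

If the connection drops, calling `example()` again should pick up where the partial file ends rather than at byte zero. This is roughly what `wget -c` does, but that equivalence is an assumption about behavior, not something tested against this server.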
JeremiahCurtis
commented
Jan 26, 2017
|
ftp://newftp.epa.gov/RSEI/Version233_RY2012/Disaggregated_Microdata/ |
lgreenlee
commented
Jan 26, 2017
|
I'm looking at this - it looks like the server is reaching its connection
limits. Aria2 might be a good option for fast downloads.
On Thu, Jan 26, 2017 at 12:28 PM JeremiahCurtis wrote:
ftp://newftp.epa.gov/RSEI/Version233_RY2012/Disaggregated_Microdata/
is anyone else able to access?
|
Serubin
commented
Jan 26, 2017
|
I think I've hit my connection limits - I've got to bow out. I've got some amount of data that I can pass off to anyone - or I am happy to grab data from someone who downloaded to try and host the data somewhere. |
adinbied
commented
Jan 26, 2017
|
While it's not DOWN for me, it's requiring a username and password to connect. |
lrehmann
commented
Jan 26, 2017
|
The server is responding with "421 Maximum login limit has been reached". Various clients give different messages when the server cannot be reached with the default anonymous credentials. Chrome asks for a username and password when in fact the anonymous credentials are still valid; the server is just overwhelmed. |
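Since the 421 is a temporary "too many logins" reply rather than a real authentication failure, a client can simply wait and try again. A rough sketch with Python's `ftplib`, which raises `error_temp` for 4xx replies: the helper names `connect_with_backoff` and `anonymous_login`, the number of attempts and the delays are all assumptions for illustration, not values anyone in this thread has tuned.

```python
import time
from ftplib import FTP, error_temp


def connect_with_backoff(connect, attempts=5, base_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off on FTP 421 replies.

    connect is any zero-argument callable that returns a logged-in session.
    Temporary errors other than 421 are re-raised immediately.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except error_temp as exc:
            is_421 = str(exc).startswith("421")
            if not is_421 or attempt == attempts - 1:
                raise
            # Double the wait after each refused attempt: 30s, 60s, 120s, ...
            sleep(base_delay * 2 ** attempt)


def anonymous_login():
    """Hypothetical connector for the server discussed in this thread."""
    ftp = FTP("newftp.epa.gov", timeout=60)
    ftp.login()  # anonymous credentials by default
    return ftp
```

Usage would be `ftp = connect_with_backoff(anonymous_login)`. Backing off instead of hammering the server also leaves login slots free for other people pulling data.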
adinbied
commented
Jan 26, 2017
|
OK, didn't know that. Thanks! |
ecoquant
commented
Jan 26, 2017
|
We have that subdirectory mirrored along with *cdiac.ornl.gov*. That
subdirectory by itself has about 87 Gb. This is tracked as The Azimuth
Backup Project Issue #3. It was one of the first we did.
To everyone, I would not, however, rely upon single copies. It would be
good to know someone else has it, too, or could replicate ours
elsewhere.
On Thu, Jan 26, 2017, at 12:13, JeremiahCurtis wrote:
I'm also running downthemall on thousands of files from a lot of the
directories at http://cdiac.ornl.gov/ftp/
This doesn't help direct download speeds, but if someone can confirm
that the above ftp has been completely mirrored, I will end the dta
session and that should speed up direct download.....thanks
|
JeremiahCurtis
commented
Jan 26, 2017
|
what is the cdiac ftp mirror address? i followed the link on the main cdiac issue page here, and could not actually find any data.....maybe i'm missing something....thanks |
Serubin
commented
Jan 26, 2017
|
Given that this data source is going to be taken down at any time (and that the source is crazy slow), I think priority one should be downloading it - even if it's spread across multiple people. We can consolidate and duplicate later. |
randomvariable
commented
Jan 26, 2017
|
Started a sync of ftp://newftp.epa.gov/RSEI/Version233_RY2012/Disaggregated_Microdata/ at about 500KB/s |
gofrogs2013
commented
Jan 30, 2017
•
|
I'm curious if it would be worthwhile to try to make a FOIA request for this information as I'm having the same issue with slow downloads and we could get it on a hard drive or similar, albeit with a fee. The entire dataset could be sent on a 2 TB external HD. |
|
@gofrogs2013 Good idea |
|
Can someone volunteer to coordinate this issue? It's great that so many people are dividing it up to get it done! If one of you could track who has what that would be really helpful. Thanks! |
Serubin
commented
Jan 31, 2017
|
I've suffered an untimely hard drive failure, I gotta back out. Sorry. |
gofrogs2013
commented
Jan 31, 2017
•
|
I went ahead and made a FOIA request for all data in the newftp folder. You can check the progress here: https://foiaonline.regulations.gov/foia/action/public/view/request?objectId=090004d281137e25 |
gofrogs2013
commented
Feb 1, 2017
|
@bkirkbri Per the previous comment, I've made the FOIA request and added a link. I won't be able to coordinate it beyond that if we still want to try downloading the rest of it (which is probably the case) as I'll be working on NASA ERS files #289 for a while, but I'll post here if they approve the request. |
JeremiahCurtis
commented
Feb 3, 2017
|
@randomvariable How is the microdata folder moving? I am attempting a grab of the following RSEI subfolders: temp and shapefiles |
donbright
commented
Feb 4, 2017
•
|
fyi for anyone trying to look at @empirical-bayesian issue links, they actually refer to https://bitbucket.org/azimuth-backup/azimuth-inventory/issues/89 not the automatically generated github issues (like this #89) |
StephWo
commented
Feb 4, 2017
|
I'm trying to get those Microdata files. I started with the last one in alphabetical order (Micro2012_2012...) and will go backwards from that. ETA for the first file is in 9 days... |
gofrogs2013
commented
Feb 7, 2017
|
@BauerPiepenbrink Is your download still going, and if so do you have the same ETA? Hopefully it will be possible to download these large files, but if not I will try getting them from the agency via FOIA as I mention above. |
donbright
commented
Feb 7, 2017
|
I just checked and ./AIR_QUALITY_DATA only has 58M of data in a single .zip file, which is far less than what @Serubin reported above. does anyone have a public mirror up for cross-checking data? |
StephWo
commented
Feb 7, 2017
|
@gofrogs2013 steady as a rock. ETA 6d 23h with an average of 130 K/s. It's not fast but reliable so far. A friend of mine and I used to try to calculate which has better bandwidth from Europe to China: a gigabit Internet uplink or a sea container full of hard drives. |
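The ETAs quoted in this thread are just remaining bytes divided by average rate. A small sketch of that arithmetic, assuming the rate stays constant and that "K" means 1024 bytes (the download client's actual convention is not stated here); `eta_seconds` and `format_eta` are illustrative names.

```python
def eta_seconds(remaining_bytes, rate_bytes_per_s):
    """Seconds left at a constant transfer rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return remaining_bytes / rate_bytes_per_s


def format_eta(seconds):
    """Render a duration as whole days and hours, e.g. '1d 2h'."""
    days, rem = divmod(int(seconds), 86400)
    return f"{days}d {rem // 3600}h"


# One full Microdata file (size taken from the hashdeep line posted in this
# thread) at the 130 K/s average mentioned above:
full_file = format_eta(eta_seconds(110831639138, 130 * 1024))
```

Under those assumptions `full_file` works out to about 9d 15h, which is in the same range as the 9-day estimate given earlier for the first file.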
StephWo
commented
Feb 7, 2017
|
I shouldn't have jinxed it. Got interrupted by the server half an hour ago. Continuing now. |
JeremiahCurtis
commented
Feb 13, 2017
|
@Serubin hope your hard drive failure doesn't mean your download is irretrievable :) |
StephWo
commented
Feb 15, 2017
•
|
So, the first file from the Disaggregated_Microdata folder is finally downloaded. It's Micro2012_2012.csv: http://176.9.83.61/InProgress_279/Disaggregated_Microdata/
Hashdeep checksum for that single file:
110831639138,1d94bea31fe0bd03d732e01b7e7d6ab8,9087314828d9736e275d395f749b354676f7f4164a003319c3501257053b8366,Micro2012_2012.csv |
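For anyone who wants to cross-check their own copy against that line without installing hashdeep, here is a minimal sketch in Python. It assumes the line follows hashdeep's default `size,md5,sha256,filename` layout, which matches the field lengths above. The function names are made up for illustration.

```python
import hashlib


def parse_hashdeep_line(line):
    """Split a 'size,md5,sha256,filename' record into typed fields."""
    size, md5, sha256, name = line.strip().split(",", 3)
    return int(size), md5.lower(), sha256.lower(), name


def file_digests(path, chunk_size=1 << 20):
    """Stream the file once and return (size, md5 hex, sha256 hex).

    Reading in chunks keeps memory use flat even for a 110 GB file.
    """
    md5, sha256, size = hashlib.md5(), hashlib.sha256(), 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            size += len(chunk)
            md5.update(chunk)
            sha256.update(chunk)
    return size, md5.hexdigest(), sha256.hexdigest()


def matches_hashdeep(path, line):
    """True if the local file has the recorded size and both digests."""
    size, md5, sha256, _name = parse_hashdeep_line(line)
    return file_digests(path) == (size, md5, sha256)
```

Comparing the size first by hand (for example with `ls -l`) is a cheap way to spot a truncated download before spending the time to hash the whole file.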
gofrogs2013
commented
Feb 15, 2017
|
Disregard the referenced issue above. |
StephWo
commented
Feb 16, 2017
•
|
@gofrogs2013 well, I won't try to open it in whole :) So, as the file extension promised, comma-separated values. If someone really wants to dig into that, there seems to be a piece of software called Microdata_Extractor that basically filters the csv files. I will try to download that too if I stumble upon it. |
JeremiahCurtis
commented
Feb 16, 2017
•
|
just wondering what we're still missing on the RSEI folder; I have finished: Version233_RY2012/Public_Release_Data/CSV version/ |
Serubin
commented
Feb 20, 2017
|
@JeremiahCurtis Still working on retrieval. Picked up another 4TB drive so I should be able to get back to data pulling soon. |
gofrogs2013
commented
Feb 20, 2017
|
@BauerPiepenbrink Are you trying to download the 2010/11 csv files as well? Maybe you and @Serubin could each do one. |
StephWo
commented
Feb 20, 2017
•
|
@gofrogs2013 As I said, I'm going in reverse order, so I'm at Micro2012_2011.csv right now (56GB downloaded, 4 days to go). |
gofrogs2013
commented
Feb 22, 2017
|
For the record, I withdrew the FOIA request as the downloads are working. I won't be able to do 2010 myself but perhaps another user here could grab it. |
JeremiahCurtis
commented
Feb 22, 2017
•
|
there are also three huge files under ftp://newftp.epa.gov/RSEI/Version234_RY2014/Disaggregated_Microdata/
I'm running IDM on the three files at ftp://newftp.epa.gov/RSEI/Version235_RY2015/Disaggregated_Microdata/ , with a transfer rate of about 700-800 KB/s when running all three simultaneously. ETA: 4-5 days for the set.
Update: I had my ISP increase my speed to 10 Mbps, and so I'm running these at a combined 1.2-1.3 MB/s right now. ETA for the trio: less than 3 days. |
JeremiahCurtis
commented
Mar 2, 2017
•
|
Finished the three huge files at ftp://newftp.epa.gov/RSEI/Version235_RY2015/Disaggregated_Microdata/ . Will attempt ftp://newftp.epa.gov/RSEI/Version234_RY2014/Disaggregated_Microdata/ after I finish a few more NCEI folders.
I'm up to 20 Mbps service, looking into Gbps but not sure if I can afford it.
local mirror of ftp://newftp.epa.gov/RSEI/Version235_RY2015/Disaggregated_Microdata |
StephWo
commented
Mar 6, 2017
|
I finished |
gofrogs2013
commented
Mar 8, 2017
|
Awesome work @BauerPiepenbrink ! |
StephWo
commented
Apr 2, 2018
|
Be advised: 162 Find all these datasets at http://176.9.83.62 or http://climatemirror1.space |
Serubin commented Jan 26, 2017
•
edited
Current ftp contents:
899M ./AIR_QUALITY_DATA
0 ./CAM_HRA
2.3G ./CERCLA108B
406G ./COMPTOX
406G ./Computational_Toxicology_Data (Looks like a duplicate of the above)
2.2G ./EJSCREEN
33G ./EPADataCommons
44G ./GKM_DOCUMENTS
1.0T ./RSEI
7.5G ./RTPGIS
62M ./STANDARD_MINE
1.0K ./TESTAREA
1.9T .
Currently pulled down on my machine:
899M ./AIR_QUALITY_DATA
31M ./GKM_DOCUMENTS
2.2G ./EJSCREEN
14G ./EPADataCommons
2.3G ./CERCLA108B
4.0K ./CAM_HRA
32G ./COMPTOX
52G .
I intend to make my mirror public, but that may have to wait until the weekend.