Trying this out soon on new LS220DE #4

Closed
whcrg opened this issue Jan 3, 2019 · 128 comments

Comments

@whcrg

whcrg commented Jan 3, 2019

Hey!
Really nice that this project exists!

Just wanted to give feedback and tell that I'll be testing this out on new LS220DE when it arrives and I find some spare time.

The USB should work normally with Debian? The aim is to create a wireless home NAS, because there seem to be no ready-made solutions with RAID capability floating around. I'm going with a TP-LINK TL-WN821N mini WLAN stick because I read it works nicely with the Raspberry Pi etc. using Debian-based software.

@1000001101000
Owner

Great to hear from you!

To answer your question: yes, the USB port works like a normal USB 2.0 port, without the usual restrictions of the stock firmware. Assuming the Debian kernel includes the needed drivers etc., I imagine it will work just fine.

Let me know how it goes.

@1000001101000
Owner

I’m currently working on a simplified version of the installer which handles all of the manual install steps at the end of the installation process (copying over dtb/db files, creating new initramfs, etc).

Would you be interested in giving it a try?

I’d love for someone other than myself to try it before I merge it into master.

@whcrg
Author

whcrg commented Jan 3, 2019

Yeah, I am happy to try it out!
It takes probably around two weeks for the hardware to arrive.
You can also email me, j (only the single letter) at kirah dot fi.

@1000001101000
Owner

right on. Depending on how things go I may have everything updated and published by then. I'll post an update on this thread next week sometime either with a link to the branch I'm working on or just confirming it's been published.

Thanks again!

@1000001101000
Owner

I got the new installer finished and tested faster than I anticipated (I found a better way to do parts of it). It's now published in the repository.

Let me know how it works for you.

@whcrg
Author

whcrg commented Jan 11, 2019

Still waiting for the hw to show up, but will be testing soon I hope!

@1000001101000
Owner

I just fixed a minor issue with the ls2xx devices: it turns out the ethernet activity LED is connected to a different pin than it is on the other models (I accidentally had it hardcoded as off). I just committed the changes needed to fix that.

When you get your device, make sure to pull the latest changes before you do your install so that you get that fix.

@whcrg
Author

whcrg commented Jan 18, 2019

Finally it came. Trying to follow instruction exactly to test them also.

Is bootloader stuff etc likely to work if I reuse original firmware made partitions like this?
[image: screenshot of the partition layout]

@1000001101000
Owner

Excellent!
Yes, I find re-using the existing raid devices from the stock firmware is the easiest way to get started with these devices. The screenshot you attached looks good to me.

@1000001101000
Owner

I assume you already know this, but I have some steps in the "Post Install" section that describe how to mount/access the existing data volume once the device is up and running. Let me know if you run into any issues or if something is unclear, I'm always looking for feedback to improve them.

@1000001101000
Owner

...I suppose if you just got the device you might not be looking to keep the existing xfs filesystem, if so you don't need to actually perform those steps.

@whcrg
Author

whcrg commented Jan 19, 2019

Everything worked perfectly!

(For the install, that is. Samba and USB Wi-Fi don't want to cooperate; Samba is a bit of a surprise, since it should be quite old and stable stuff.)

@1000001101000
Owner

Which version are you using? I've not had Samba issues on Stretch (other than it being a bit of a pain to configure, just being Samba). Buster is still in "testing" and could have all sorts of problems.

If you get stuck with the USB Wi-Fi device we can check whether Debian includes the driver in their kernel. If not, you might need to recompile the kernel to add support.

Let me know how it goes!

@whcrg
Author

whcrg commented Jan 20, 2019

Samba is just Samba. I last fought with it 10 years ago and couldn't get the configuration just right, and it seems to be just as hard today. I installed a Stretch system but installed Samba and its dependencies from testing. My Mac sees it and can connect, but files won't move and I get cryptic errors in the logs. I have to investigate when I find time (big exam tomorrow) and/or fully upgrade to testing to be sure that mixed distro packages are not the problem. I need Samba 4.8.0+ to have better Apple support. Trying to follow this: http://wa.rwick.com/2018/04/08/minimal-ubuntu-time-machine-backup-service/

Also, for the USB WLAN I got a different revision of the stick than the people reporting it to work (I have to mail order everything to Latvia). So far I have compiled 3 different versions of the driver, and all of them seem to find the stick, but none of them can get the LED to light up or find any networks. I'm now getting a different stick that is sold with a promise that it works with the Raspberry Pi, so it should be Linux and ARM friendly.

Could it be that the fan speed/temp stuff has something wrong in the device tree or somewhere? I get four speeds, 5000, 3250, 1500 and 0, reported by pwmconfig and sensors, but 3250 seems and sounds slower than 1500. Sadly I don't have a laser RPM monitor to check the actual RPM; it might also just be the fan motor not liking the 1500 PWM frequency and making more noise.

pwmconfig:

PWM 255 FAN 5000 -- full throttle
PWM 240 FAN 5000
PWM 225 FAN 5000
PWM 210 FAN 5000
PWM 195 FAN 5000
PWM 180 FAN 5000
PWM 165 FAN 3250 -- slow and silent but running
PWM 150 FAN 3250
PWM 135 FAN 3250
PWM 120 FAN 3250
PWM 105 FAN 3250
PWM 90 FAN 3250
PWM 75 FAN 1500 -- in between
PWM 60 FAN 1500
PWM 45 FAN 1500
PWM 30 FAN 1500
PWM 28 FAN 1500
PWM 26 FAN 1500
PWM 24 FAN 1500
PWM 22 FAN 1500
PWM 20 FAN 1500
PWM 18 FAN 1500
PWM 16 FAN 1500
PWM 14 FAN 1500
PWM 12 FAN 1500
PWM 10 FAN 1500
PWM 8 FAN 1500
PWM 6 FAN 1500
PWM 4 FAN 1500
PWM 2 FAN 1500
PWM 0 FAN 0 -- stopped
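
The banding in the table above can be reproduced with a small sketch. It assumes the gpio-fan driver rounds the PWM value up to the next of its four speed steps; that rule is an assumption chosen because it matches the table, not something stated in this thread. The function name `pwm_to_step` is made up for illustration.

```shell
# Map a PWM value (0-255) to one of four speed steps (0-3).
# ASSUMPTION: step = pwm * 3 / 255, rounded up. This matches the pwmconfig
# table above but is not confirmed anywhere in the thread.
pwm_to_step() {
    echo $(( ($1 * 3 + 254) / 255 ))
}

# Print the step for a few PWM values taken from the table.
for pwm in 0 75 90 165 180 255; do
    echo "PWM $pwm -> step $(pwm_to_step "$pwm")"
done
```

Under that assumption, steps 1, 2 and 3 are the bands reported as 1500, 3250 and 5000, so the reported RPM is just a label attached to a step rather than a measurement.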

@whcrg
Author

whcrg commented Jan 20, 2019

Forced the fan to run at constant speed with constant small load:
"1500 rpm": reported temperature stabilises around 64°C
"3250 rpm": reported temperature keeps slowly creeping upwards (stopped the test at 75°C)
It seems the fan speeds match what the ear says.

@1000001101000
Owner

You're probably right, I never noticed that.

The gpio-fan driver uses the gpio/speed mapping in the device tree and just assumes the fan is going at the listed speed. Fortunately the fan alarm works independently of this and will reliably detect if the fan isn't running.

I got the fan speeds from the original device tree I based these on. At one point I tried using an iPhone strobe light app to verify the speeds, but it wasn't nearly accurate enough (nor is my vision, for that matter). I've been planning to measure the speed using an Arduino to read the tachometer signal and get accurate measurements. Since the different models use different size fans, some of the values are almost certainly wrong, though it shouldn't affect their function.

I had never considered the values could be out of order, I tried to validate that when the device reported a speed change that the sound changed but I never paid attention to the pitch or temp correlation for that matter.

Now that you've brought this to my attention I'm going to see if I can confirm this for all the devices with fans and re-order the values in the device trees as needed.

@whcrg
Author

whcrg commented Jan 20, 2019

Yeah, it doesn't really cause any adverse effect, just an annoying oscillation when there is some load, because the lowest and highest settings are next to each other. First it heats up and runs at full speed, then it cools a bit and runs at the lowest setting (which the fan control thinks is medium), then it heats up again and goes back to full speed... I suppose the order of the running speeds is easily fixable? The RPM itself is not important, but the order of the modes kind of is.

The best compromise for now in /etc/fancontrol seems to be setting it to always run at medium speed by default and making the transition from lowest to highest as small as possible, so there is no oscillation between modes:

INTERVAL=10
DEVPATH=hwmon0= hwmon1=devices/platform/gpio-fan
DEVNAME=hwmon0=armada_thermal hwmon1=gpio_fan
FCTEMPS= hwmon1/pwm1=hwmon0/temp1_input
FCFANS= hwmon1/pwm1=hwmon1/fan1_input
MINTEMP= hwmon1/pwm1=68
MAXTEMP= hwmon1/pwm1=69
MINSTART= hwmon1/pwm1=50
MINSTOP= hwmon1/pwm1=50
MINPWM= hwmon1/pwm1=50
MAXPWM= hwmon1/pwm1=180
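
To see what this configuration asks for, here is a simplified model of the temperature-to-PWM decision. It is a sketch under stated assumptions, not fancontrol's actual code: it assumes MINPWM is used at or below MINTEMP, MAXPWM at or above MAXTEMP, and a linear ramp from MINSTOP in between, with the temperature given in millidegrees as hwmon's temp1_input reports it. The function name `target_pwm` is made up.

```shell
# Values copied from the /etc/fancontrol file above.
MINTEMP=68; MAXTEMP=69; MINSTOP=50; MINPWM=50; MAXPWM=180

# Simplified model (ASSUMPTION, not fancontrol's real implementation).
# Argument: temperature in millidegrees Celsius.
target_pwm() {
    t=$1
    lo=$(( MINTEMP * 1000 ))
    hi=$(( MAXTEMP * 1000 ))
    if [ "$t" -le "$lo" ]; then
        echo "$MINPWM"
    elif [ "$t" -ge "$hi" ]; then
        echo "$MAXPWM"
    else
        # Linear ramp between MINSTOP and MAXPWM inside the 1 degree window.
        echo $(( (t - lo) * (MAXPWM - MINSTOP) / (hi - lo) + MINSTOP ))
    fi
}

for t in 60000 68500 75000; do
    echo "$t mC -> PWM $(target_pwm "$t")"
done
```

With only one degree between MINTEMP and MAXTEMP, the model spends almost all of its time at PWM 50 or PWM 180, which is the point of the compromise described above.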

@1000001101000
Owner

looking closer it looks like I have the values inverted, 5000rpm seems to stop the fan and set off the alarm, it reads the fan at startup as 0rpm with no alarm (typically it starts at full speed).

this probably just means that the ls200 board is different than the others in this regard (there are several features like that). I'm just a little bothered I hadn't noticed before.

I'm testing a new dtb for the ls220 which appears to correct it and am checking the other devices (the ls420 seems to work as expected at first glance).

I should have the new version up by the end of the day (UTC - 6).

@1000001101000
Owner

hmm, retesting with the current dtb on stretch I don't get the issue but I do under buster. I'm very confused now.

are you seeing this using the stretch kernel or the buster kernel?

@1000001101000
Owner

it looks like there is a bug of some kind causing the gpio-fan driver to invert the gpio values it sets in the latest kernel, I found the same behavior on all the devices with gpio-fan enabled.

I just pushed new device trees that account for this. when/if this gets fixed in the kernel I'll have to change them back.

To use the updated dtb copy it to /etc/flash-kernel/dtbs/ and then run flash-kernel and reboot.
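
The steps above can be sketched as a short script. The dtb file name is a hypothetical placeholder (use the file for your model from the repository), and the `run` helper is made up for illustration: it only prints the commands unless DRY_RUN=0 is set, in which case they need to run as root.

```shell
# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# HYPOTHETICAL file name; substitute the dtb that matches your device.
DTB="armada-370-buffalo-ls220de.dtb"

install_dtb() {
    run cp "$DTB" /etc/flash-kernel/dtbs/   # override the packaged dtb
    run flash-kernel                        # rebuild the boot images
    run reboot
}

install_dtb
```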

@whcrg
Author

whcrg commented Jan 21, 2019

I am at present running stretch kernel and base system (only Samba and some of its dependencies from testing). For me the pwm=0 really stops the fan and pwm=255 really is full throttle and only the middle values seem to be inverted.

Trying today if I can get 60fps video of the fan with one marked blade to maybe calculate the actual speeds.

@1000001101000
Owner

lol, so they are separate issues. Either way I'm glad I know about the 4.19 issue so I can track it.

I tried using an audio meter app to determine the difference in sound frequencies yesterday, but that didn't work. I hadn't considered trying to video it; supposedly the iPhone does slow motion at up to something like 240fps, so I'll give that a try too.

Let me know what you figure out, if you're right it likely means that the high and low fan pins are reversed in the device tree, if so that's a fairly easy fix though I have to figure out if it's just the ls220 that are wrong or if other models are too.

@whcrg
Author

whcrg commented Jan 21, 2019

The video was not helpful with the cameras I have.
Next I removed the grill from the back of the box and applied a really small piece of tape to one blade so that it made contact with the four supports of the fan -> a repeating noise 4 times per revolution (one a bit different, because the support with the fan wires is wider). The tape of course causes a little drag, but these values should still be pretty valid.

FFT-tachometer app (Strobily):
PWM 255: 3445
PWM 120: 1787
PWM 20: 2604

Measured the repeat cycle length manually in Audacity and calculated the RPM:
PWM 255: 3529 (one round 17ms)
PWM 120: 1875 (one round 32ms)
PWM 20: 2609 (one round 23ms)

The Strobily app's strobe flashing with the coloured propeller blade seems to confirm the slowest speed. At the other speeds my cell phone cannot keep up with the synchronized flashing.
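
For reference, the conversion behind the Audacity figures is RPM = 60000 / (milliseconds per revolution). A minimal sketch, with a made-up function name and integer rounding to the nearest RPM:

```shell
# RPM from the duration of one revolution in milliseconds,
# rounded to the nearest whole number.
rpm_from_period_ms() {
    echo $(( (60000 + $1 / 2) / $1 ))
}

# The three cycle lengths measured above.
for ms in 17 32 23; do
    echo "$ms ms per revolution -> $(rpm_from_period_ms "$ms") RPM"
done
```

This gives 3529, 1875 and 2609 RPM for 17, 32 and 23 ms, matching the values listed above.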

@1000001101000
Owner

wow, good work. Seems to confirm that the pins are reversed. I wonder how much the tape slows things down, I'd like to update the settings to reflect more accurate values while I'm at it. looks like 3500 2600 1850 and 0 plus however much the tape slowed things down.

I think I'll push the update to fix the pin order tonight and start working on a way to reproduce your test. I'm eager to discover if this issue affects any other devices as well.

@whcrg
Author

whcrg commented Jan 21, 2019

I don't think the drag from the tape (really small, maybe a 2mm x 3mm free flap) can have a very large effect, but of course I have no idea if it is tens of RPM or hundreds... Probably less than the variation between individual fans and other components anyway.

@1000001101000
Owner

cool. I'm testing an updated dtb with those pins reversed, it already looks like the temperatures at each speed are making more sense. I'll put your values in for speeds and publish the updated files shortly.

@1000001101000
Owner

Alright, the changes have been published. Let me know how it works for you.

@whcrg
Author

whcrg commented Jan 22, 2019

Works good! Fan control for ls220d with stretch default kernel works now as intended!

Also, it seems that removing the grill improved airflow so much that the fastest fan speed is not needed even at 100% CPU load. I have been stressing the CPU for 15 minutes already and can't even get to 70°C with the 2600rpm fan setting; with the grill it took just minutes. Maybe when there is also disk load, or on warmer days, the fan will need to go full speed.

Removing the grill is recommended if the box is in a place where pets or children won't have access to the fan! (And it's not dangerous even if they did; there is only 12V inside and the fan has no great power.)

@1000001101000
Owner

Was your grill removable or did this involve cutting plastic?

I removed the cover from my ls421de, marked one of the fan blades and recorded it using the 240fps slow motion mode of the iPhone. I was able to confirm that those speeds are out of order too. The values I got were:

5000: 6 frames per revolution
3250: 12 frames per revolution
1500: 8 frames per revolution
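
As a rough cross-check, frame counts convert to RPM as fps * 60 / (frames per revolution). The sketch below assumes the recording really ran at 240 frames per second; the function name is made up, and counting whole frames makes the result coarse.

```shell
# RPM from camera frame rate and frames counted per revolution.
# Usage: rpm_from_frames <fps> <frames_per_revolution>
rpm_from_frames() {
    echo $(( $1 * 60 / $2 ))
}

# Frame counts listed above, labelled with the speed the device reported.
echo "reported 5000: about $(rpm_from_frames 240 6) RPM"
echo "reported 3250: about $(rpm_from_frames 240 12) RPM"
echo "reported 1500: about $(rpm_from_frames 240 8) RPM"
```

That works out to roughly 2400, 1200 and 1800 RPM, so the setting reported as 3250 is the slowest one here as well.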

I’m going to repeat that measurement with the ls220, ls410 and ls441 and try to measure those as well.

Thanks again for noticing that!

@whcrg
Author

whcrg commented Jan 22, 2019

The grill was integrated to the housing but small sharp wire clippers did the job hotplug style without even shutting down. :P
Bye bye warranty, though the diskless box cost just under 90€

Did I understand correctly, if I want to dist-upgrade to testing, only thing I have to worry about is putting the right dtb for the new kernel to /etc/flash-kernel/dtbs/ before?

@whcrg
Author

whcrg commented Feb 1, 2019

Everything seems to just work, time machine is happy with the Samba, sshfs automounting works etc and no reboots/crashes so far! (now I'll have to figure out a new hobby or find new problems :D)

temp1:        +59.5°C  
/dev/sda: SAMSUNG HD403LJ: 32°C
 13:37:20 up 2 days, 1 min,  2 users,  load average: 0.08, 0.10, 0.03
              total        used        free      shared  buff/cache   available
Mem:         247836       46324        9232        9500      192280      184116
Swap:        999420       10668      988752

@whcrg
Author

whcrg commented Feb 1, 2019

Already planning the move to the new disks.

  • #2 big disk to the 2nd slot and #1 big disk to the USB-SATA bridge, while the small disk remains in the 1st slot --> should boot Stretch from the small disk as usual
  • kexec the installer kernel+initrd -> running from RAM disk
  • mount both the / and /boot raid devices and do some rm -rf and cp -a to clone the system
  • not forgetting to set up /etc/fstab with the right partition IDs
  • big disk #1 to the 1st slot and continue where I left off

See any problems with this?

@1000001101000
Owner

I don't fully understand but if you're doing what I think then you'll need to adjust /etc/fstab to change the uuids to their new values.

If you're using raid 1 (how Buffalo sets it up) for /, /boot and swap, there is an easier way. Basically you can clone the partition table onto your new disk (gdisk has a good function for that). Then you can just add the new partitions to the existing arrays and wait for the resync,

similar to mdadm --grow /dev/md0 -n 3 --add /dev/sdc1

that lets you add the new disk without even rebooting and without any uuid changes.

after that you can remove the obsolete drive and shrink the arrays back down to 2 drives

mdadm --grow /dev/md0 -n 2
##might be --manage
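
Put together, the sequence for one array might look like the sketch below. Everything specific in it is an assumption: the device names are hypothetical, it uses sgdisk (the scriptable sibling of gdisk) on the assumption that the disks use GPT, and the fail/remove step is one possible way to drop the old member. The `run` helper is made up and only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# HYPOTHETICAL device names: OLD is a disk already in the arrays, NEW is
# the replacement, RETIRED is the disk to drop afterwards.
OLD=/dev/sda; NEW=/dev/sdc; RETIRED=/dev/sdb

migrate_md0() {
    run sgdisk -R "$NEW" "$OLD"    # copy OLD's partition table onto NEW
    run sgdisk -G "$NEW"           # give NEW its own disk/partition GUIDs
    run mdadm --grow /dev/md0 --raid-devices=3 --add "${NEW}1"
    run cat /proc/mdstat           # repeat until the resync has finished
    run mdadm /dev/md0 --fail "${RETIRED}1" --remove "${RETIRED}1"
    run mdadm --grow /dev/md0 --raid-devices=2
}

migrate_md0
```

The same steps would be repeated for each array (for example /boot and swap) with the matching partition numbers.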

@1000001101000
Owner

you can watch the resync process with mdadm --detail /dev/md0

@whcrg
Author

whcrg commented Feb 1, 2019

Sounds handy!
I am using the raid 1 devices just as the default firmware made them so root and boot partition sizes should match (googling didn't give a clear answer what would happen to fs if one was "growing" the array with a smaller partition).

@1000001101000
Owner

The one real limitation is the partitions must be the same size, or at least the new ones must be >= the old ones.

You're basically duplicating the partition at the block level.

@whcrg
Author

whcrg commented Feb 1, 2019

I should have somehow saved the grown arrays. Now I have a bunch of different arrays, each having just one physical device... (but the system migrated to big drive #1 nicely). After booting, the pieces of the arrays somehow can't find each other.

I have to research this a bit more. It is so confusing that mdadm reports totally different UUIDs than fstab etc. has (raid device UUID vs. file system UUID).

@whcrg
Author

whcrg commented Feb 1, 2019

After re-adding the #2 big disk partitions, putting the sudo mdadm --detail --scan output into /etc/mdadm/mdadm.conf again and rerunning flash-kernel, things now look good even after a reboot! No idea what went wrong the first time.

@1000001101000
Owner

I've lost track of exactly what you're trying at this stage but here are a couple of notes:

  1. anytime you make a change to /etc/mdadm/mdadm.conf you'll want to generate a new initrd
    update-initramfs -u (this will also trigger flash-kernel)
    this will ensure that your devices get the same names on boot
  2. these days you don't have to update mdadm.conf if you're just adding a device to an existing array

personally I use device names rather than uuids for raid arrays in my /etc/fstab, as long as you adhere to #1 above it accomplishes pretty much the same thing and is a little easier to read ( I still use uuid for regular partitions )

@whcrg
Author

whcrg commented Feb 1, 2019

Did:
mdadm --grow /dev/md0 -n 3 --add /dev/sdc1 (etc)
(waited for the recovery/sync)
updated /etc/mdadm/mdadm.conf
ran flash-kernel to generate initrd
booted (took really long)
and the arrays were shattered for no apparent reason

Redid the same steps and booted again, and now things work. The only difference was a different physical drive in bay 1 when using mdadm.

@1000001101000
Owner

Make sure to run a separate "update-initramfs -u" after making mdadm.conf changes.

flash-kernel doesn't generate a new initrd, it just packages the current one into a uImage for U-Boot. You need a new one with the new mdadm.conf in it for the arrays to be built properly at boot.

same goes for fstab changes, amongst other things. Think about it this way, anything your system will need before it mounts "/" needs to be in the initrd and up to date. In this case to boot it needs to know what device gets mounted as "/" and it needs the mdadm info to know how to assemble it. update-initramfs -u does all this for you but you need to trigger it manually in these cases.
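
A minimal sketch of that routine, assuming a standard Debian initramfs-tools setup. The `run` helper is made up and only prints the commands unless DRY_RUN=0 is set; the lsinitramfs step is an optional extra check, not something from the thread.

```shell
# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_initrd() {
    # Show the arrays as mdadm sees them, to compare with mdadm.conf.
    run mdadm --detail --scan
    # Rebuild the initrd so it carries the current mdadm.conf and fstab;
    # on these devices this also triggers flash-kernel.
    run update-initramfs -u
    # Optional check: list the initrd and look for etc/mdadm/mdadm.conf.
    run lsinitramfs "/boot/initrd.img-$(uname -r)"
}

refresh_initrd
```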

@whcrg
Author

whcrg commented Feb 1, 2019

Aaa! That was the problem then. I could have sworn that flash-kernel says something along "generating initrd".

@1000001101000
Owner

it's the other way around, generating an initrd triggers flash-kernel

It's super helpful, since pretty much anything done by a package that needs a new initrd will trigger update-initramfs, which in turn triggers flash-kernel, but you have to do the same for anything you do manually.

changing raid arrays and installing graphics drivers (any drivers really) are the two that have messed me up in the past

@whcrg
Author

whcrg commented Feb 2, 2019

This is the message I assumed meant generating the image from scratch: "Generating initramfs u-boot image... done." One should not assume things but check :)

No reboot/crash so far but with heavy disk load pages and pages of this http://uvkk.kirah.fi/jotainmuutarandomia/randomfiles/log4.txt started appearing in syslog again with the new disks installed.

dmesg has the same stuff and additionally occasional [ 7817.120304] mvneta d0074000.ethernet eth0: Linux processing - Can't refill

So something to do with memory allocation and the ethernet driver for Armada board and disk stuff.

Trying again if sysctl -w vm.min_free_kbytes=9000 or 12000 will stop these errors.
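
If the setting turns out to help, it can be applied immediately and also made persistent across reboots. A sketch under assumptions: the file name under /etc/sysctl.d/ is hypothetical, 12000 is just one of the two values mentioned above, and the made-up `run` helper only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

VALUE=12000
# HYPOTHETICAL file name; any *.conf file in /etc/sysctl.d/ would do.
CONF=/etc/sysctl.d/90-min-free-kbytes.conf

set_min_free() {
    run sysctl -w "vm.min_free_kbytes=$VALUE"                 # apply now
    run sh -c "echo 'vm.min_free_kbytes = $VALUE' > $CONF"    # keep on reboot
    run sysctl --system                                       # reload all files
}

set_min_free
```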

@whcrg
Author

whcrg commented Feb 3, 2019

Seems stable so far!

@1000001101000
Owner

Let me know how it goes, I may add that as part of the install process

@whcrg
Author

whcrg commented Feb 9, 2019

I think I might still have had an incorrect initrd when I got those error messages. Just to be sure, a week ago I regenerated the initrd once more and tried to provoke the errors again by returning the vm.min_free_kbytes setting to its default (1844), but there have been no errors so far and everything Just Works(tm), no matter what kind of load I throw at the box. And it sits in a semi-closed cupboard with non-optimal ventilation.

temp1:        +60.3°C  
/dev/sda: ST2000DM008-2FR102: 38°C
 18:15:22 up 6 days, 21:15,  1 user,  load average: 0.07, 0.07, 0.01
              total        used        free      shared  buff/cache   available
Mem:         247836       40412       40880       12628      166544      186912
Swap:        999420       14556      984864

Maybe the problems with Buster arise from something in the init process...

@1000001101000
Owner

Have you run into anything else of interest since you got up and running?

Now that Buster RC1 is out I'm planning to take another look at the install process and see if anything needs updating.

@whcrg
Author

whcrg commented May 7, 2019

No problems, everything so far has run smoothly with Stretch (and Samba from Buster). Even time machine backups just keep happening full auto.

@1000001101000
Owner

I'm going to go ahead and close this issue. Let me know if you run into anything else I should look at in the future!

@whcrg
Author

whcrg commented Feb 26, 2022

Back to business. Everything worked well for a long time, but now I decided to try Buster again (as Stretch is so old...) and the random crashes under high load are back.

Last things from syslog, something to do with network driver possibly.
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.548736] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.555503] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.562272] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0

Contemplating whether I should try dist-upgrading to Bullseye to see if that works.
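
To judge whether a change makes these messages go away, it helps to count them. A small sketch with a made-up function name; it reads log text on standard input and matches the two mvneta messages quoted in this thread.

```shell
# Count mvneta allocation/refill failures in log text read from stdin.
count_mvneta_errors() {
    grep -Ec "mvneta .*Can't (allocate skb|refill)"
}

# Demo on the three syslog lines quoted above.
count_mvneta_errors <<'EOF'
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.548736] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.555503] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0
Feb 26 09:00:13 LS220DEAA3 kernel: [ 2191.562272] mvneta d0074000.ethernet eth0: Can't allocate skb on queue 0
EOF
```

On a live system the input would typically be the syslog file or the output of dmesg, for example `dmesg | count_mvneta_errors`.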

@1000001101000
Owner

If you’re having problems I would probably start with a fresh Bullseye install and compare from a clean starting point.

@whcrg
Author

whcrg commented Feb 27, 2022

Very true, that is on my task list for some day anyway.
For now it seems stability has been achieved with installing 5.10 kernel from buster backports! 10h with loads of 10 and no reboots.
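
For anyone wanting to try the same thing, the install might look roughly like the sketch below. It rests on assumptions not stated in the thread: that buster-backports is already enabled in the apt sources, and that the kernel metapackage for this board is linux-image-armmp. The made-up `run` helper only prints the commands unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 (as root) to really run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_kernel() {
    run apt-get update
    # ASSUMPTION: linux-image-armmp is the right metapackage for this device.
    run apt-get install -t buster-backports linux-image-armmp
    run reboot
}

upgrade_kernel
```

As discussed earlier in the thread, a dtb matching the new kernel may need to be placed in /etc/flash-kernel/dtbs/ before rebooting.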
